POLYNUCLEOTIDES ENCODING ANTIGENIC HIV TYPE C
POLYPEPTIDES, POLYPEPTIDES AND USES THEREOF

Technical Field

Polynucleotides encoding antigenic HIV polypeptides (e.g., those shown in
Table Q) are described, as are uses of these polynucleotides and polypeptide
products, including formulations of immunogenic compositions and uses thereof.



Background of the Invention 

Acquired immune deficiency syndrome (AIDS) is recognized as one of the
greatest health threats facing modern medicine. There is, as yet, no cure for
this disease.

In 1983-1984, three groups independently identified the suspected etiological
agent of AIDS. See, e.g., Barre-Sinoussi et al. (1983) Science 220:868-871;
Montagnier et al., in Human T-Cell Leukemia Viruses (Gallo, Essex & Gross, eds.,
1984); Vilmer et al. (1984) The Lancet 1:753; Popovic et al. (1984) Science
224:497-500; Levy et al. (1984) Science 225:840-842. These isolates were variously
called lymphadenopathy-associated virus (LAV), human T-cell lymphotropic virus
type III (HTLV-III), or AIDS-associated retrovirus (ARV). All of these isolates are
strains of the same virus, and were later collectively named Human Immunodeficiency
Virus (HIV). With the isolation of a related AIDS-causing virus, the strains originally
called HIV are now termed HIV-1 and the related virus is called HIV-2. See, e.g.,
Guyader et al. (1987) Nature 326:662-669; Brun-Vezinet et al. (1986) Science
233:343-346; Clavel et al. (1986) Nature 324:691-695.

A great deal of information has been gathered about the HIV virus; however,
to date an effective vaccine has not been identified. Several targets for vaccine
development have been examined, including the env and Gag gene products encoded
by HIV. Gag gene products include, but are not limited to, Gag-polymerase and
Gag-protease. Env gene products include, but are not limited to, monomeric gp120
polypeptides, oligomeric gp140 polypeptides and gp160 polypeptides.

Haas, et al., (Current Biology 6(3):315-324, 1996) suggested that selective
codon usage by HIV-1 appeared to account for a substantial fraction of the inefficiency
of viral protein synthesis. Andre, et al., (J. Virol. 72(2):1497-1503, 1998) described
an increased immune response elicited by DNA vaccination employing a synthetic
gp120 sequence with modified codon usage. Schneider, et al., (J. Virol. 71(7):4892-4903,
1997) discuss inactivation of inhibitory (or instability) elements (INS) located
within the coding sequences of the Gag and Gag-protease coding sequences.
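
The general idea of modified codon usage mentioned in these references can be illustrated with a minimal sketch. It assumes a single preferred codon per amino acid; the PREFERRED table and the function names translate and recode are illustrative assumptions, not the tables or methods of Haas, Andre, or Schneider. Every codon is swapped for a synonymous one, so the encoded polypeptide is unchanged.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Illustrative assumption: one "preferred" codon per amino acid ("*" = stop).
PREFERRED = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",
}


def codons(dna):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def recode(dna):
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in codons(dna))


if __name__ == "__main__":
    original = "ATGGGTGCAAGATAA"
    modified = recode(original)
    print(modified)
    print(translate(original) == translate(modified))
```

In this sketch the nucleotide sequence changes while translate returns the same protein for both inputs. It does not address the inhibitory (INS) elements discussed by Schneider et al.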

The Gag proteins of HIV-1 are necessary for the assembly of virus-like
particles. HIV-1 Gag proteins are involved in many stages of the life cycle of the virus,
including assembly, virion maturation after particle release, and early post-entry steps
in virus replication. The roles of HIV-1 Gag proteins are numerous and complex
(Freed, E.O., Virology 251:1-15, 1998).

Wolf, et al., (PCT International Application, WO 96/30523, published 3
October 1996; European Patent Application, Publication No. 0 449 116 A1, published
2 October 1991) have described the use of altered pr55 Gag of HIV-1 to act as a
non-infectious retroviral-like particulate carrier, in particular, for the presentation of
immunologically important epitopes. Wang, et al., (Virology 200:524-534, 1994)
describe a system to study assembly of HIV Gag-β-galactosidase fusion proteins into
virions. They describe the construction of sequences encoding HIV
Gag-β-galactosidase fusion proteins, the expression of such sequences in the presence
of HIV Gag proteins, and assembly of these proteins into virus particles.

Shiver, et al., (PCT International Application, WO 98/34640, published 13
August 1998) described altering HIV-1 (CAM1) Gag coding sequences to produce
synthetic DNA molecules encoding HIV Gag and modifications of HIV Gag. The
codons of the synthetic molecules were codons preferred by a projected host cell.

Recently, use of HIV Env polypeptides in immunogenic compositions has been
described (see, U.S. Patent No. 5,846,546 to Hurwitz et al., issued December 8,
1998, describing immunogenic compositions comprising a mixture of at least four
different recombinant viruses that each express a different HIV env variant; and U.S.
Patent No. 5,840,313 to Vahlne et al., issued November 24, 1998, describing peptides
which correspond to epitopes of the HIV-1 gp120 protein). In addition, U.S. Patent
No. 5,876,731 to Sia et al., issued March 2, 1999, describes candidate vaccines against
HIV comprising an amino acid sequence of a T-cell epitope of Gag linked directly to
an amino acid sequence of a B-cell epitope of the V3 loop protein of an HIV-1 isolate
containing the sequence GPGR.

Summary of the Invention 

Described herein are novel HIV sequences, polypeptides encoded by these
novel sequences, and synthetic expression cassettes generated from these and other
HIV sequences. In one aspect, the present invention relates to improved HIV
expression cassettes. In a second aspect, the present invention relates to generating an
immune response in a subject using the expression cassettes of the present invention.
In a further aspect, the present invention relates to generating an immune response in a
subject using the expression cassettes of the present invention, as well as polypeptides
encoded by the expression cassettes of the present invention. In another aspect, the
present invention relates to enhanced vaccine technologies for the induction of potent
neutralizing antibodies and/or cellular immune responses against HIV in a subject.

In certain embodiments, the present invention relates to isolated wild-type
polynucleotides and/or expression cassettes encoding HIV polypeptides, including, but
not limited to, Env, Gag, Pol, Prot, RT, Int, Vpr, Vpu, Vif, Nef, Tat, Rev and/or
combinations and fragments thereof. Mutations in some of the genes are described that
reduce or eliminate the activity of the gene product without adversely affecting the
ability of the gene product to generate an immune response. Exemplary
polynucleotides include, but are not limited to, EnvTV001c8.2 (SEQ ID NO:61),
EnvTV001c8.5 (SEQ ID NO:62), EnvTV001c12.1 (SEQ ID NO:63),
EnvTV003cE260 (SEQ ID NO:64), EnvTV004cC300 (SEQ ID NO:65), EnvTV006c9.1
(SEQ ID NO:66), EnvTV006c9.2 (SEQ ID NO:67), EnvTV006cE9 (SEQ ID NO:68),
EnvTV007cB104 (SEQ ID NO:69), EnvTV007cB105 (SEQ ID NO:70),
EnvTV008c4.3 (SEQ ID NO:71), EnvTV008c4.4 (SEQ ID NO:72), EnvTV010cD7
(SEQ ID NO:73), EnvTV012c2.1 (SEQ ID NO:74), EnvTV012c2.2 (SEQ ID
NO:75), EnvTV013cB20 (SEQ ID NO:76), EnvTV013cH17 (SEQ ID NO:77),
EnvTV014c6.3 (SEQ ID NO:78), EnvTV014c6.4 (SEQ ID NO:79),
EnvTV018cF1027 (SEQ ID NO:80), EnvTV019c5 (SEQ ID NO:81), GagTV001G8
(SEQ ID NO:82), GagTV001G11 (SEQ ID NO:83), GagTV002G8 (SEQ ID NO:84),
GagTV003G15 (SEQ ID NO:85), GagTV004G17 (SEQ ID NO:86), GagTV004G24
(SEQ ID NO:87), GagTV006G11 (SEQ ID NO:88), GagTV006G97 (SEQ ID
NO:89), GagTV007G59 (SEQ ID NO:90), GagTV008G65 (SEQ ID NO:91),
GagTV008G66 (SEQ ID NO:92), GagTV010G74 (SEQ ID NO:93), GagTV012G34
(SEQ ID NO:94), GagTV012G40 (SEQ ID NO:95), GagTV013G2 (SEQ ID NO:96),
GagTV013G15 (SEQ ID NO:97), GagTV014G73 (SEQ ID NO:98),
GagTV018G60 (SEQ ID NO:99), GagTV019G20 (SEQ ID NO:100),
GagTV019G25 (SEQ ID NO:101), 8_2_TV1 LTR (SEQ ID NO:181), and
2_1/4_TV12_C_ZA (SEQ ID NO:182).

In other embodiments, the present invention relates to synthetic polynucleotides
and/or expression cassettes encoding HIV polypeptides, including but not limited to
Env, Gag, Pol, Prot, Int, Vpr, Vpu, Vif, Nef, Tat, Rev and/or combinations and
fragments thereof. In addition, the present invention also relates to improved
expression of HIV polypeptides and production of virus-like particles. Synthetic
expression cassettes encoding the HIV polypeptides (e.g., Gag-, pol-, protease (prot)-,
reverse transcriptase, integrase, RNAseH, Tat, Rev, Nef, Vpr, Vpu, Vif and/or
Env-containing polypeptides) are described, as are uses of the expression cassettes.
Mutations in some of the genes are described that reduce or eliminate the activity of
the gene product without adversely affecting the ability of the gene product to
generate an immune response. Exemplary synthetic polynucleotides include, but are
not limited to, GagComplPolmut_C (SEQ ID NO:9), GagComplPolmutAtt_C (SEQ
ID NO:10), GagComplPolmutIna_C (SEQ ID NO:11),
GagComplPolmutInaTatRevNef_C (SEQ ID NO:12), GagPolmut_C (SEQ ID
NO:13), GagPolmutAtt_C (SEQ ID NO:14), GagPolmutIna_C (SEQ ID NO:15),
GagProtInaRTmut_C (SEQ ID NO:16), GagProtInaRTmutTatRevNef_C (SEQ ID
NO:17), GagRTmut_C (SEQ ID NO:18), GagRTmutTatRevNef_C (SEQ ID NO:19),
GagTatRevNef_C (SEQ ID NO:20), gp120mod.TV1.del118-210 (SEQ ID NO:21),
gp120mod.TV1.delV1V2 (SEQ ID NO:22), gp120mod.TV1.delV2 (SEQ ID NO:23),
gp140mod.TV1.del118-210 (SEQ ID NO:24), gp140mod.TV1.delV1V2 (SEQ ID
NO:25), gp140mod.TV1.delV2 (SEQ ID NO:26), gp140mod.TV1.mut7 (SEQ ID
NO:27), gp140mod.TV1.tpa2 (SEQ ID NO:28), gp140TMmod.TV1 (SEQ ID
NO:29), gp160mod.TV1.del118-210 (SEQ ID NO:30), gp160mod.TV1.delV1V2
(SEQ ID NO:31), gp160mod.TV1.delV2 (SEQ ID NO:32), gp160mod.TV1.dV1
(SEQ ID NO:33), gp160mod.TV1.dV1-gagmod.BW965 (SEQ ID NO:34),
gp160mod.TV1.dV1V2-gagmod.BW965 (SEQ ID NO:35),
gp160mod.TV1.dV2-gagmod.BW965 (SEQ ID NO:36), gp160mod.TV1.tpa2 (SEQ ID NO:37),
gp160mod.TV1-gagmod.BW965 (SEQ ID NO:38), int.opt.mut_C (SEQ ID NO:39),
int.opt_C (SEQ ID NO:40), nef.D106G.-myr19.opt_C (SEQ ID NO:41),
p15RnaseH.opt_C (SEQ ID NO:42), p2Pol.opt.YMWM_C (SEQ ID NO:43),
p2Polopt.YM_C (SEQ ID NO:44), p2Polopt_C (SEQ ID NO:45), p2PolTatRevNef
opt C (SEQ ID NO:46), p2PolTatRevNef.opt.native_C (SEQ ID NO:47),
p2PolTatRevNef.opt_C (SEQ ID NO:48), protInaRT.YM.opt_C (SEQ ID NO:49),
protInaRT.YMWM.opt_C (SEQ ID NO:50), ProtRT.TatRevNef.opt_C (SEQ ID
NO:51), rev.exon1_2.M5-10.opt_C (SEQ ID NO:52), tat.exon1_2.opt.C22-37_C
(SEQ ID NO:53), tat.exon1_2.opt.C37_C (SEQ ID NO:54),
TatRevNef.opt.native_ZA (SEQ ID NO:55), TatRevNef.opt_ZA (SEQ ID NO:56),
TatRevNefGag C (SEQ ID NO:57), TatRevNefgagCpolIna C (SEQ ID NO:58),
TatRevNefGagProtInaRTmut C (SEQ ID NO:59), TatRevNefProtRT opt C (SEQ ID
NO:60), gp140.modTV1.mut1.dV2 (SEQ ID NO:183), gp140mod.TV1.mut2.dV2
(SEQ ID NO:184), gp140mod.TV1.mut3.dV2 (SEQ ID NO:185),
gp140mod.TV1.mut4.dV2 (SEQ ID NO:186), gp140.mod.TV1.GM161 (SEQ ID
NO:187), gp140mod.TV1.GM161-195-204 (SEQ ID NO:188),
gp140mod.TV1.GM161-204 (SEQ ID NO:189), gp140mod.TV1.GM-V1V2 (SEQ ID
NO:190), gp140modC8.2mut7.delV2.Kozmod.Ta (SEQ ID NO:191), and
Nef-myrD124LLAA (SEQ ID NO:203).

Thus, one aspect of the present invention relates to expression cassettes and
polynucleotides contained therein. The expression cassettes typically include an HIV
polypeptide-encoding sequence inserted into an expression vector backbone. In one
embodiment, an expression cassette comprises a polynucleotide sequence encoding
one or more polypeptides, wherein the polynucleotide sequence comprises a sequence
having between about 85% to 100% and any integer values therebetween, for example,
at least about 85%, preferably about 90%, more preferably about 95%, and more
preferably about 98% sequence identity to the sequences taught in the present
specification.
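
As an illustration of how a percent sequence identity figure of this kind can be computed, the following minimal sketch compares two sequences that are assumed to be already aligned to equal length, with "-" marking gaps. The function names percent_identity and within_range, and the convention of counting identical positions over all alignment columns, are assumptions made for this example; other conventions exist, and the sketch is not a statement of how identity is determined for the sequences taught herein.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned to equal length, with '-' as the gap
    character. Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def within_range(aligned_a, aligned_b, minimum=85.0):
    """True when identity is at least `minimum` percent (upper bound is 100)."""
    return percent_identity(aligned_a, aligned_b) >= minimum


if __name__ == "__main__":
    reference = "ATGGCCAAGT"
    candidate = "ATGGCTAAGT"  # one substitution in ten positions
    print(percent_identity(reference, candidate))
    print(within_range(reference, candidate))
```

The default threshold of 85.0 simply mirrors the lower end of the range recited above; the 90, 95 and 98 values can be passed as minimum in the same way.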

The polynucleotides encoding the HIV polypeptides of the present invention
may also include sequences encoding additional polypeptides. Such additional
polynucleotides encoding polypeptides may include, for example, coding sequences for
other viral proteins (e.g., hepatitis B or C or other HIV proteins, such as
polynucleotide sequences encoding an HIV Gag polypeptide, polynucleotide
sequences encoding an HIV Env polypeptide and/or polynucleotides encoding one or
more of vif, vpr, tat, rev, vpu and nef); cytokines or other transgenes.

In one embodiment, the sequence encoding the HIV Pol polypeptide(s) can be
modified by deletions of coding regions corresponding to reverse transcriptase and
integrase. Such deletions in the polymerase polypeptide can also be made such that the
polynucleotide sequence preserves T-helper cell and CTL epitopes. Other antigens of
interest may be inserted into the polymerase as well.

In another embodiment, an expression cassette comprises a polynucleotide
sequence encoding a polypeptide, for example, GagComplPolmut_C (SEQ ID NO:9),
GagComplPolmutAtt_C (SEQ ID NO:10), GagComplPolmutIna_C (SEQ ID NO:11),
GagComplPolmutInaTatRevNef_C (SEQ ID NO:12), GagPolmut_C (SEQ ID
NO:13), GagPolmutAtt_C (SEQ ID NO:14), GagPolmutIna_C (SEQ ID NO:15),
GagProtInaRTmut_C (SEQ ID NO:16), GagProtInaRTmutTatRevNef_C (SEQ ID
NO:17), GagRTmut_C (SEQ ID NO:18), GagRTmutTatRevNef_C (SEQ ID NO:19),
GagTatRevNef_C (SEQ ID NO:20), gp120mod.TV1.del118-210 (SEQ ID NO:21),
gp120mod.TV1.delV1V2 (SEQ ID NO:22), gp120mod.TV1.delV2 (SEQ ID NO:23),
gp140mod.TV1.del118-210 (SEQ ID NO:24), gp140mod.TV1.delV1V2 (SEQ ID
NO:25), gp140mod.TV1.delV2 (SEQ ID NO:26), gp140mod.TV1.mut7 (SEQ ID
NO:27), gp140mod.TV1.tpa2 (SEQ ID NO:28), gp140TMmod.TV1 (SEQ ID
NO:29), gp160mod.TV1.del118-210 (SEQ ID NO:30), gp160mod.TV1.delV1V2
(SEQ ID NO:31), gp160mod.TV1.delV2 (SEQ ID NO:32), gp160mod.TV1.dV1
(SEQ ID NO:33), gp160mod.TV1.dV1-gagmod.BW965 (SEQ ID NO:34),
gp160mod.TV1.dV1V2-gagmod.BW965 (SEQ ID NO:35),
gp160mod.TV1.dV2-gagmod.BW965 (SEQ ID NO:36), gp160mod.TV1.tpa2 (SEQ ID NO:37),
gp160mod.TV1-gagmod.BW965 (SEQ ID NO:38), int.opt.mut_C (SEQ ID NO:39),
int.opt_C (SEQ ID NO:40), nef.D106G.-myr19.opt_C (SEQ ID NO:41),
p15RnaseH.opt_C (SEQ ID NO:42), p2Pol.opt.YMWM_C (SEQ ID NO:43),
p2Polopt.YM_C (SEQ ID NO:44), p2Polopt_C (SEQ ID NO:45), p2PolTatRevNef
opt C (SEQ ID NO:46), p2PolTatRevNef.opt.native_C (SEQ ID NO:47),
p2PolTatRevNef.opt_C (SEQ ID NO:48), protInaRT.YM.opt_C (SEQ ID NO:49),
protInaRT.YMWM.opt_C (SEQ ID NO:50), ProtRT.TatRevNef.opt_C (SEQ ID
NO:51), rev.exon1_2.M5-10.opt_C (SEQ ID NO:52), tat.exon1_2.opt.C22-37_C
(SEQ ID NO:53), tat.exon1_2.opt.C37_C (SEQ ID NO:54),
TatRevNef.opt.native_ZA (SEQ ID NO:55), TatRevNef.opt_ZA (SEQ ID NO:56),
TatRevNefGag C (SEQ ID NO:57), TatRevNefgagCpolIna C (SEQ ID NO:58),
TatRevNefGagProtInaRTmut C (SEQ ID NO:59), and TatRevNefProtRT opt C (SEQ
ID NO:60), wherein the polynucleotide sequence encoding the polypeptide comprises
a sequence having between about 85% to 100% and any integer values therebetween,
for example, at least about 85%, preferably about 90%, more preferably about 95%,
and more preferably about 98% sequence identity to the sequences taught in the
present specification.

The native and synthetic polynucleotide sequences encoding the HIV
polypeptides of the present invention typically have between about 85% to 100% and
any integer values therebetween, for example, at least about 85%, preferably about
90%, more preferably about 95%, and more preferably about 98% sequence identity to
the sequences taught herein. Further, in certain embodiments, the polynucleotide
sequences encoding the HIV polypeptides of the invention will exhibit 100% sequence
identity to the sequences taught herein.

The polynucleotides of the present invention can be produced by recombinant
techniques, synthetic techniques, or combinations thereof.

The present invention further includes recombinant expression systems for use
in selected host cells, wherein the recombinant expression systems employ one or more
of the polynucleotides and expression cassettes of the present invention. In such
systems, the polynucleotide sequences are operably linked to control elements
compatible with expression in the selected host cell. Numerous expression control
elements are known to those in the art, including, but not limited to, the following:
transcription promoters, transcription enhancer elements, transcription termination
signals, polyadenylation sequences, sequences for optimization of initiation of
translation, and translation termination sequences. Exemplary transcription promoters
include, but are not limited to, those derived from CMV, CMV+intron A, SV40, RSV,
HIV-Ltr, MMLV-ltr, and metallothionein.

In another aspect, the invention includes cells comprising one or more of the
expression cassettes of the present invention, where the polynucleotide sequences are
operably linked to control elements compatible with expression in the selected cell. In
one embodiment, such cells are mammalian cells. Exemplary mammalian cells include,
but are not limited to, BHK, VERO, HT1080, 293, RD, COS-7, and CHO cells.
Other cells, cell types, tissue types, etc., that may be useful in the practice of the
present invention include, but are not limited to, those obtained from the following:
insects (e.g., Trichoplusia ni (Tn5) and Sf9), bacteria, yeast, plants, antigen presenting
cells (e.g., macrophage, monocytes, dendritic cells, B-cells, T-cells, stem cells, and
progenitor cells thereof), primary cells, immortalized cells, and tumor-derived cells.

In a further aspect, the present invention includes compositions for generating
an immunological response, where the composition typically comprises at least one of
the expression cassettes of the present invention and may, for example, contain
combinations of expression cassettes such as one or more expression cassettes carrying
a Pol-derived-polypeptide-encoding polynucleotide, one or more expression cassettes
carrying a Gag-derived-polypeptide-encoding polynucleotide, one or more expression
cassettes carrying accessory polypeptide-encoding polynucleotides (e.g., native or
synthetic vpu, vpr, nef, vif, tat, rev), and/or one or more expression cassettes carrying
an Env-derived-polypeptide-encoding polynucleotide. Such compositions may further
contain an adjuvant or adjuvants. The compositions may also contain one or more
HIV polypeptides. The HIV polypeptides may correspond to the polypeptides
encoded by the expression cassette(s) in the composition, or may be different from
those encoded by the expression cassettes. In compositions containing both
expression cassettes (or polynucleotides of the present invention) and polypeptides,
various expression cassettes of the present invention can be mixed and/or matched
with various HIV polypeptides described herein.

In another aspect, the present invention includes methods of immunization of a
subject. In the method, any of the above-described compositions are introduced into
the subject under conditions that are compatible with expression of the expression
cassette(s) in the subject. In one embodiment, the expression cassettes (or
polynucleotides of the present invention) can be introduced using a gene delivery
vector. The gene delivery vector can, for example, be a non-viral vector or a viral
vector. Exemplary viral vectors include, but are not limited to, eucaryotic layered
vector initiation systems, Sindbis-virus (or other alphavirus) derived vectors, retroviral
vectors, and lentiviral vectors. Other exemplary vectors include, but are not limited to,
pCMVKm2, pCMV6a, pCMV-link, and pCMVPLEdhfr. Compositions useful for
generating an immunological response can also be delivered using a particulate carrier
(e.g., PLG or CTAB-PLG microparticles). Further, such compositions can be coated on,
for example, gold or tungsten particles and the coated particles delivered to the subject
using, for example, a gene gun. The compositions can also be formulated as
liposomes. In one embodiment of this method, the subject is a mammal and can, for
example, be a human.

In a further aspect, the invention includes methods of generating an immune
response in a subject. Any of the expression cassettes described herein can be
expressed in a suitable cell to provide for the expression of the HIV polypeptides
encoded by the polynucleotides of the present invention. The polypeptide(s) are then
isolated (e.g., substantially purified) and administered to the subject in an amount
sufficient to elicit an immune response. In certain embodiments, the methods comprise
administration of one or more of the expression cassettes or polynucleotides of the
present invention, using any of the gene delivery techniques described herein. In other
embodiments, the methods comprise co-administration of one or more of the
expression cassettes or polynucleotides of the present invention and one or more
polypeptides, wherein the polypeptides can be expressed from these polynucleotides or
can be other HIV polypeptides. In other embodiments, the methods comprise
co-administration of multiple expression cassettes or polynucleotides of the present
invention. In still further embodiments, the methods comprise co-administration of
multiple polypeptides, for example polypeptides expressed from the polynucleotides of
the present invention and/or other HIV polypeptides.

The invention further includes methods of generating an immune response in a
subject, where cells of a subject are transfected with any of the above-described
expression cassettes or polynucleotides of the present invention, under conditions that
permit the expression of a selected polynucleotide and production of a polypeptide of
interest (e.g., encoded by any expression cassette of the present invention). By this
method, an immunological response to the polypeptide is elicited in the subject.
Transfection of the cells may be performed ex vivo and the transfected cells are
reintroduced into the subject. Alternately, or in addition, the cells may be transfected
in vivo in the subject. The immune response may be humoral and/or cell-mediated
(cellular). In a further embodiment, this method may also include administration of an
HIV polypeptide before, concurrently with, and/or after introduction of the
expression cassette into the subject.

The polynucleotides of the present invention may be employed singly or in
combination. The polynucleotides of the present invention, encoding HIV-derived
polypeptides, may be expressed in a variety of ways, including, but not limited to, the
following: a polynucleotide encoding a single gene product (or portion thereof)
expressed from a promoter; multiple polynucleotides encoding more than one gene
product (or portion thereof) (e.g., polycistronic coding sequences); multiple
polynucleotides in-frame to produce a single polyprotein; and multiple polynucleotides
in-frame to produce a single polyprotein wherein the polyprotein has protein cleavage
sites between one or more of the polypeptides comprising the polyprotein.
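
The in-frame polyprotein arrangement described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names strip_stop and fuse_in_frame are hypothetical, the toy inputs are not sequences from this specification, and the optional linker is only a placeholder for a sequence encoding a cleavage site (the GGCAGC used in the usage example encodes Gly-Ser and is not an actual protease cleavage site).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(cds):
    """Return the coding sequence without a trailing stop codon, if present."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        return cds[:-3]
    return cds


def fuse_in_frame(coding_sequences, linker=""):
    """Join coding sequences in one reading frame to encode a single polyprotein.

    Trailing stop codons are removed from every part, an optional in-frame
    linker (e.g., one encoding a cleavage site) is placed between parts, and a
    single stop codon is appended at the end.
    """
    if len(linker) % 3 != 0:
        raise ValueError("linker length must be a multiple of 3 to keep the frame")
    bodies = [strip_stop(cds) for cds in coding_sequences]
    return linker.upper().join(bodies) + "TAA"


if __name__ == "__main__":
    part_one = "ATGGCCTAA"  # toy sequence: Met-Ala-stop
    part_two = "ATGAAATGA"  # toy sequence: Met-Lys-stop
    print(fuse_in_frame([part_one, part_two]))
    print(fuse_in_frame([part_one, part_two], linker="GGCAGC"))
```

Because every part and the linker are whole numbers of codons, the downstream parts remain in the reading frame set by the first start codon. The sketch does not check for internal stop codons and does not model polycistronic arrangements.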

These and other embodiments of the present invention will readily occur to
those of ordinary skill in the art in view of the disclosure herein.



Brief Description of the Figures

Figures 1A to 1D depict the nucleotide sequence of HIV Type C
8_5_TV1_C.ZA (SEQ ID NO:1; referred to herein as TV1). Various regions are
shown in Table A.

Figures 2A-C depict an alignment of Env polypeptides from various HIV
isolates (SF162, SEQ ID NO:2; TV1.8_2, SEQ ID NO:3; TV1.8_5, SEQ ID NO:4;
TV2.12-5/1, SEQ ID NO:5; Consensus Sequence, SEQ ID NO:6). The regions
between the arrows indicate regions (of TV1 and TV2 clones, both HIV Type C
isolates) in the beta and/or bridging sheet region(s) that can be deleted and/or
truncated. The marked positions denote N-linked glycosylation sites (of TV1 and TV2
clones), one or more of which can be modified (e.g., deleted and/or mutated).

Figure 3 presents a schematic diagram showing the relationships between the
following forms of the HIV Env polypeptide: gp160, gp140, gp120, and gp41.

Figure 4 presents exemplary data concerning transactivation activity of Tat
mutants on LTR-CAT plasmid expression in 293 cells.

Figure 5 presents exemplary data concerning export activity of Rev mutants
monitored by CAT expression.

Figure 6, sheets 1 and 2, presents the sequence of the construct
GagComplPolmut_C (SEQ ID NO:9).

Figure 7, sheets 1 and 2, presents the sequence of the construct
GagComplPolmutAtt_C (SEQ ID NO:10).

Figure 8, sheets 1 and 2, presents the sequence of the construct
GagComplPolmutIna_C (SEQ ID NO:11).

Figure 9, sheets 1 and 2, presents the sequence of the construct
GagComplPolmutInaTatRevNef_C (SEQ ID NO:12).

Figure 10 presents the sequence of the construct GagPolmut_C (SEQ ID
NO:13).

Figure 11 presents the sequence of the construct GagPolmutAtt_C (SEQ ID
NO:14).

Figure 12 presents the sequence of the construct GagPolmutIna_C (SEQ ID
NO:15).

Figure 13 presents the sequence of the construct GagProtInaRTmut_C (SEQ
ID NO:16).

Figure 14, sheets 1 and 2, presents the sequence of the construct
GagProtInaRTmutTatRevNef_C (SEQ ID NO:17).

Figure 15 presents the sequence of the construct GagRTmut_C (SEQ ID
NO:18).

Figure 16, sheets 1 and 2, presents the sequence of the construct
GagRTmutTatRevNef_C (SEQ ID NO:19).

Figure 17 presents the sequence of the construct GagTatRevNef_C (SEQ ID
NO:20).

Figure 18 presents the sequence of the construct gp120mod.TV1.del118-210
(SEQ ID NO:21).

Figure 19 presents the sequence of the construct gp120mod.TV1.delV1V2
(SEQ ID NO:22).

Figure 20 presents the sequence of the construct gp120mod.TV1.delV2 (SEQ
ID NO:23).

Figure 21 presents the sequence of the construct gp140mod.TV1.del118-210
(SEQ ID NO:24).

Figure 22 presents the sequence of the construct gp140mod.TV1.delV1V2
(SEQ ID NO:25).

Figure 23 presents the sequence of the construct gp140mod.TV1.delV2 (SEQ
ID NO:26).

Figure 24 presents the sequence of the construct gp140mod.TV1.mut7 (SEQ
ID NO:27).

Figure 25 presents the sequence of the construct gp140mod.TV1.tpa2 (SEQ
ID NO:28).

Figure 26 presents the sequence of the construct gp140TMmod.TV1 (SEQ ID
NO:29).

Figure 27 presents the sequence of the construct gp160mod.TV1.del118-210
(SEQ ID NO:30).

Figure 28 presents the sequence of the construct gp160mod.TV1.delV1V2
(SEQ ID NO:31).

Figure 29 presents the sequence of the construct gp160mod.TV1.delV2 (SEQ
ID NO:32).

Figure 30 presents the sequence of the construct gp160mod.TV1.dV1 (SEQ
ID NO:33).

Figure 31, sheets 1 and 2, presents the sequence of the construct
gp160mod.TV1.dV1-gagmod.BW965 (SEQ ID NO:34).

Figure 32, sheets 1 and 2, presents the sequence of the construct
gp160mod.TV1.dV1V2-gagmod.BW965 (SEQ ID NO:35).

Figure 33, sheets 1 and 2, presents the sequence of the construct
gp160mod.TV1.dV2-gagmod.BW965 (SEQ ID NO:36).

Figure 34 presents the sequence of the construct gp160mod.TV1.tpa2 (SEQ
ID NO:37).

Figure 35, sheets 1 and 2, presents the sequence of the construct
gp160mod.TV1-gagmod.BW965 (SEQ ID NO:38).

Figure 36 presents the sequence of the construct int.opt.mut_C (SEQ ID
NO:39).

Figure 37 presents the sequence of the construct int.opt_C (SEQ ID NO:40).

Figure 38 presents the sequence of the construct nef.D106G.-myr19.opt_C
(SEQ ID NO:41).

Figure 39 presents the sequence of the construct p15RnaseH.opt_C (SEQ ID
NO:42).

Figure 40 presents the sequence of the construct p2Pol.opt.YMWM_C (SEQ
ID NO:43).

Figure 41 presents the sequence of the construct p2Polopt.YM_C (SEQ ID
NO:44).

Figure 42 presents the sequence of the construct p2Polopt_C (SEQ ID
NO:45).

Figure 43 presents the sequence of the construct p2PolTatRevNef opt C (SEQ
ID NO:46).

Figure 44 presents the sequence of the construct
p2PolTatRevNef.opt.native_C (SEQ ID NO:47).

Figure 45 presents the sequence of the construct p2PolTatRevNef.opt_C
(SEQ ID NO:48).

Figure 46 presents the sequence of the construct protInaRT.YM.opt_C (SEQ
ID NO:49).

Figure 47 presents the sequence of the construct protInaRT.YMWM.opt_C
(SEQ ID NO:50).

Figure 48 presents the sequence of the construct ProtRT.TatRevNef.opt_C
(SEQ ID NO:51).

Figure 49 presents the sequence of the construct rev.exon1_2.M5-10.opt_C
(SEQ ID NO:52).

Figure 50 presents the sequence of the construct tat.exon1_2.opt.C22-37_C
(SEQ ID NO:53).

Figure 51 presents the sequence of the construct tat.exon1_2.opt.C37_C (SEQ
ID NO:54).

Figure 52 presents the sequence of the construct TatRevNef.opt.native_ZA
(SEQ ID NO:55).

Figure 53 presents the sequence of the construct TatRevNef.opt_ZA (SEQ ID
NO:56).

Figure 54 presents the sequence of the construct TatRevNefGag C (SEQ ID
NO:57).

Figure 55, sheets 1 and 2, presents the sequence of the construct
TatRevNefgagCpolIna C (SEQ ID NO:58).

Figure 56, sheets 1 and 2, presents the sequence of the construct
TatRevNefGagProtInaRTmut C (SEQ ID NO:59).

Figure 57 presents the sequence of the construct TatRevNefProtRT opt C
(SEQ ID NO:60).

Figure 58 presents the sequence of Env of clone TV001c8.2 of isolate
C-98TV001 (SEQ ID NO:61).

Figure 59 presents the sequence of Env of clone TV001c8.5 of isolate
C-98TV001 (SEQ ID NO:62).

Figure 60 presents the sequence of Env of clone TV001c12.1 of isolate
C-98TV002 (SEQ ID NO:63).

PCT/US02/21420

Figure 61 presents the sequence of Env of clone TV003cE260 of isolate
C-98TV003 (SEQ ID NO:64).

Figure 62 presents the sequence of Env of clone TV004cC300 of isolate
C-98TV004 (SEQ ID NO:65).

Figure 63 presents the sequence of Env of clone TV006c9.1 of isolate
C-98TV006 (SEQ ID NO:66).

Figure 64 presents the sequence of Env of clone TV006c9.2 of isolate
C-98TV006 (SEQ ID NO:67).

Figure 65 presents the sequence of Env of clone TV006cE9 of isolate
C-98TV006 (SEQ ID NO:68).

Figure 66 presents the sequence of Env of clone TV007cB104 of isolate
C-98TV007 (SEQ ID NO:69).

Figure 67 presents the sequence of Env of clone TV007cB105 of isolate
C-98TV007 (SEQ ID NO:70).

Figure 68 presents the sequence of Env of clone TV008c4.3 of isolate
C-98TV008 (SEQ ID NO:71).

Figure 69 presents the sequence of Env of clone TV008c4.4 of isolate
C-98TV008 (SEQ ID NO:72).

Figure 70 presents the sequence of Env of clone TV010cD7 of isolate
C-98TV010 (SEQ ID NO:73).

Figure 71 presents the sequence of Env of clone TV012c2.1 of isolate
C-98TV012 (SEQ ID NO:74).

Figure 72 presents the sequence of Env of clone TV012c2.2 of isolate
C-98TV012 (SEQ ID NO:75).

Figure 73 presents the sequence of Env of clone TV013cB20 of isolate
C-98TV013 (SEQ ID NO:76).

Figure 74 presents the sequence of Env of clone TV013cH17 of isolate
C-98TV013 (SEQ ID NO:77).

Figure 75 presents the sequence of Env of clone TV014c6.3 of isolate
C-98TV014 (SEQ ID NO:78).
Figure 76 presents the sequence of Env of clone TV014c6.4 of isolate
C-98TV014 (SEQ ID NO:79).

Figure 77 presents the sequence of Env of clone TV018cF1027 of isolate
C-98TV018 (SEQ ID NO:80).

Figure 78 presents the sequence of Env of clone TV019c5 of isolate
C-98TV019 (SEQ ID NO:81).

Figure 79 presents the sequence of Gag of clone TV001G8 of isolate
C-98TV001 (SEQ ID NO:82).

Figure 80 presents the sequence of Gag of clone TV001G11 of isolate
C-98TV001 (SEQ ID NO:83).

Figure 81 presents the sequence of Gag of clone TV002G8 of isolate
C-98TV002 (SEQ ID NO:84).

Figure 82 presents the sequence of Gag of clone TV003G15 of isolate
C-98TV003 (SEQ ID NO:85).

Figure 83 presents the sequence of Gag of clone TV004G17 of isolate
C-98TV004 (SEQ ID NO:86).

Figure 84 presents the sequence of Gag of clone TV004G24 of isolate
C-98TV004 (SEQ ID NO:87).

Figure 85 presents the sequence of Gag of clone TV006G11 of isolate
C-98TV006 (SEQ ID NO:88).

Figure 86 presents the sequence of Gag of clone TV006G97 of isolate
C-98TV006 (SEQ ID NO:89).

Figure 87 presents the sequence of Gag of clone TV007G59 of isolate
C-98TV009 (SEQ ID NO:90).

Figure 88 presents the sequence of Gag of clone TV008G65 of isolate
C-98TV008 (SEQ ID NO:91).

Figure 89 presents the sequence of Gag of clone TV008G66 of isolate
C-98TV008 (SEQ ID NO:92).

Figure 90 presents the sequence of Gag of clone TV010G74 of isolate
C-98TV010 (SEQ ID NO:93).

Figure 91 presents the sequence of Gag of clone TV012G34 of isolate
C-98TV012 (SEQ ID NO:94).

Figure 92 presents the sequence of Gag of clone TV012G40 of isolate
C-98TV012 (SEQ ID NO:95).

Figure 93 presents the sequence of Gag of clone TV013G2 of isolate
C-98TV013 (SEQ ID NO:96).

Figure 94 presents the sequence of Gag of clone TV013G15 of isolate
C-98TV013 (SEQ ID NO:97).

Figure 95 presents the sequence of Gag of clone TV014G73 of isolate
C-98TV014 (SEQ ID NO:98).

Figure 96 presents the sequence of Gag of clone TV018G60 of isolate
C-98TV018 (SEQ ID NO:99).

Figure 97 presents the sequence of Gag of clone TV019G20 of isolate
C-98TV019 (SEQ ID NO:100).

Figure 98 presents the sequence of Gag of clone TV019G25 of isolate C-

98TV019 (SEQ ED NO: 101). 

Figures 99a1, 99a2, 99b and 99c depict alignments of the deduced amino acid
sequences of Nef (Fig. 99a1 and 99a2), Tat (Fig. 99b) and Rev (Fig. 99c) from South
African subtype C isolates (TV001 (SEQ ID NO:102 for Nef, SEQ ID NO:206 for
Tat and SEQ ID NO:230 for Rev); TV002 (SEQ ID NO:103 for Nef, SEQ ID NO:207 for
Tat and SEQ ID NO:231 for Rev); TV003 (SEQ ID NO:104 for Nef, SEQ ID NO:208 for
Tat and SEQ ID NO:232 for Rev); TV004 (SEQ ID NO:105 for Nef, SEQ ID NO:209
for Tat and SEQ ID NO:233 for Rev); TV005 (SEQ ID NO:106 for Nef, SEQ ID
NO:210 for Tat and SEQ ID NO:234 for Rev); TV006 (SEQ ID NO:107 for Nef, SEQ
ID NO:211 for Tat and SEQ ID NO:235 for Rev); TV007 (SEQ ID NO:108 for Nef,
SEQ ID NO:212 for Tat and SEQ ID NO:236 for Rev); TV008 (SEQ ID NO:109 for
Nef, SEQ ID NO:213 for Tat and SEQ ID NO:237 for Rev); TV010 (SEQ ID
NO:110 for Nef, SEQ ID NO:214 for Tat and SEQ ID NO:238 for Rev); TV012
(SEQ ID NO:111 for Nef, SEQ ID NO:215 for Tat and SEQ ID NO:239 for Rev);
TV013 (SEQ ID NO:112 for Nef, SEQ ID NO:216 for Tat and SEQ ID NO:240 for
Rev); TV014 (SEQ ID NO:113 for Nef, SEQ ID NO:217 for Tat and SEQ ID
NO:241 for Rev); TV018 (SEQ ID NO:114 for Nef, SEQ ID NO:218 for Tat and
SEQ ID NO:242 for Rev); TV019 (SEQ ID NO:115 for Nef, SEQ ID NO:219 for Tat
and SEQ ID NO:243 for Rev)) in conjunction with some subtype C reference strains
(92BR025 (SEQ ID NO:116 for Nef, SEQ ID NO:220 for Tat and SEQ ID NO:244
for Rev); 301904-Ind (SEQ ID NO:117 for Nef, SEQ ID NO:221 for Tat and SEQ ID
NO:245 for Rev); 301905-Ind (SEQ ID NO:118 for Nef, SEQ ID NO:222 for Tat and
SEQ ID NO:246 for Rev); 30199-Ind (SEQ ID NO:119 for Nef, SEQ ID NO:223 for
Tat and SEQ ID NO:247 for Rev); 96BW16-D14 (SEQ ID NO:120 for Nef, SEQ ID
NO:224 for Tat and SEQ ID NO:248 for Rev); 96BW04-09 (SEQ ID NO:121 for
Nef, SEQ ID NO:225 for Tat and SEQ ID NO:249 for Rev); 96BW12-10 (SEQ ID
NO:122 for Nef, SEQ ID NO:226 for Tat and SEQ ID NO:250 for Rev); C2220-Eth
(SEQ ID NO:123 for Nef, SEQ ID NO:227 for Tat and SEQ ID NO:251 for Rev)) as
well as the subtype B reference strain HXB2 (SEQ ID NO:124 for Nef, SEQ ID
NO:228 for Tat and SEQ ID NO:252 for Rev). The consensus sequence is shown at the
bottom (SEQ ID NO:125 for Nef, SEQ ID NO:229 for Tat and SEQ ID NO:253 for
Rev). Dots represent identical residue sequences, dashes represent gaps and asterisks
represent stop codons. Significant protein domains and conserved motifs are shaded
and labeled.

Figure 100, sheets 1 to 9, depicts alignment of the complete Env protein from
South African HIV-1 subtype C sequences (TV001c8.2 (SEQ ID NO:126);
TV001cS.l (SEQ ID NO:127); TV002c12.1 (SEQ ID NO:128); TV012c2.1 (SEQ ID
NO:129); TV012c2.2 (SEQ ID NO:130); TV006c9.1 (SEQ ID NO:131); TV006cE9
(SEQ ID NO:132); TV006c9.2 (SEQ ID NO:133); TV007cB104 (SEQ ID NO:134);
TV007cB105 (SEQ ID NO:135); TV010cD7 (SEQ ID NO:136); TV018cF1027 (SEQ
ID NO:137); TV014c6.3 (SEQ ID NO:138); TV014c6.4 (SEQ ID NO:139);
TV008c4.3 (SEQ ID NO:140); TV008c4.4 (SEQ ID NO:141); TV019c5 (SEQ ID
NO:142); TV003cE260 (SEQ ID NO:143); TV004cC300 (SEQ ID NO:144);
TV013cH17 (SEQ ID NO:145); TV013cB20 (SEQ ID NO:146)) compared to the
subtype C reference strains: IN21068 (SEQ ID NO:147), 96BW05.02 (SEQ ID
NO:148), ETH2220 (SEQ ID NO:149), and 92BR025.8 (SEQ ID NO:150) from the
Los Alamos Database. Dots denote sequence identity with the IN21068 sequence,
while dashes represent gaps introduced to optimize alignments. Carets indicate
possible glycosylation sites present in most of the sequences. Asterisks show positions
of cysteine residues. The V1, V2, V3, V4 and V5 variable loops, as well as the signal
peptide and CD4 binding residues and sites are indicated above the sequences.
Triangles at positions 11, 25 and 35 of the V3 loop indicate amino acids assessed for
SI / NSI phenotype.

Figure 101, sheets 1 to 3, depicts alignments of the deduced (A) Vif, (B) Vpr,
and (C) Vpu amino acid sequences from South African subtype C isolates (in boldface,
TV007-6 (SEQ ID NO:151 for Vif, SEQ ID NO:254 for Vpr and SEQ ID NO:288 for
Vpu); TV007-2 (SEQ ID NO:152 for Vif, SEQ ID NO:255 for Vpr and SEQ ID
NO:289 for Vpu); TV019-82 (SEQ ID NO:153 for Vif, SEQ ID NO:256 for Vpr and
SEQ ID NO:290 for Vpu); TV019-85 (SEQ ID NO:154 for Vif, SEQ ID NO:257 for
Vpr and SEQ ID NO:291 for Vpu); TV008-17 (SEQ ID NO:155 for Vif, SEQ ID
NO:258 for Vpr and SEQ ID NO:292 for Vpu); TV008-1 (SEQ ID NO:156 for Vif,
SEQ ID NO:259 for Vpr and SEQ ID NO:293 for Vpu); TV014-25 (SEQ ID NO:157
for Vif, SEQ ID NO:260 for Vpr and SEQ ID NO:294 for Vpu); TV014-31 (SEQ ID
NO:158 for Vif, SEQ ID NO:261 for Vpr and SEQ ID NO:295 for Vpu); TV004-45
(SEQ ID NO:159 for Vif, SEQ ID NO:262 for Vpr and SEQ ID NO:296 for Vpu);
TV001-2 (SEQ ID NO:160 for Vif, SEQ ID NO:263 for Vpr and SEQ ID NO:297 for
Vpu); TV018-7 (SEQ ID NO:286 for Vif, SEQ ID NO:264 for Vpr and SEQ ID
NO:298 for Vpu); TV018-8 (SEQ ID NO:161 for Vif, SEQ ID NO:265 for Vpr and
SEQ ID NO:299 for Vpu); TV002-84 (SEQ ID NO:162 for Vif, SEQ ID NO:266 for
Vpr and SEQ ID NO:300 for Vpu); TV009-3 (SEQ ID NO:163 for Vif, SEQ ID
NO:267 for Vpr and SEQ ID NO:301 for Vpu); TV013-2 (SEQ ID NO:164 for Vif,
SEQ ID NO:268 for Vpr and SEQ ID NO:302 for Vpu); TV013-3 (SEQ ID NO:165
for Vif, SEQ ID NO:269 for Vpr and SEQ ID NO:303 for Vpu); TV003-12 (SEQ ID
NO:166 for Vif, SEQ ID NO:270 for Vpr and SEQ ID NO:304 for Vpu); TV003-B
(SEQ ID NO:167 for Vif, SEQ ID NO:271 for Vpr and SEQ ID NO:305 for Vpu);
TV005-81 (SEQ ID NO:168 for Vif, SEQ ID NO:272 for Vpr and SEQ ID NO:306
for Vpu); TV012-4 (SEQ ID NO:169 for Vif, SEQ ID NO:273 for Vpr and SEQ ID
NO:307 for Vpu); TV006-9 (SEQ ID NO:170 for Vif, SEQ ID NO:274 for Vpr and
SEQ ID NO:308 for Vpu); TV010-25 (SEQ ID NO:171 for Vif, SEQ ID NO:275 for
Vpr and SEQ ID NO:309 for Vpu)) in conjunction with some subtype C reference
strains (92BR025 (SEQ ID NO:172 for Vif, SEQ ID NO:276 for Vpr and SEQ ID
NO:310 for Vpu); 301904-Ind (SEQ ID NO:173 for Vif, SEQ ID NO:277 for Vpr and
SEQ ID NO:311 for Vpu); 301905-Ind (SEQ ID NO:174 for Vif, SEQ ID NO:278 for
Vpr and SEQ ID NO:312 for Vpu); 30199-Ind (SEQ ID NO:175 for Vif, SEQ ID
NO:279 for Vpr and SEQ ID NO:313 for Vpu); 96BW16-D14 (SEQ ID NO:176 for
Vif, SEQ ID NO:280 for Vpr and SEQ ID NO:314 for Vpu); 96BW04-09 (SEQ ID
NO:177 for Vif, SEQ ID NO:281 for Vpr and SEQ ID NO:315 for Vpu); 96BW12-10
(SEQ ID NO:178 for Vif, SEQ ID NO:282 for Vpr and SEQ ID NO:316 for Vpu);
C2220-Eth (SEQ ID NO:179 for Vif, SEQ ID NO:283 for Vpr and SEQ ID NO:317
for Vpu)) as well as HXB2 (SEQ ID NO:180 for Vif, SEQ ID NO:284 for Vpr and
SEQ ID NO:318 for Vpu). Consensus sequences are shown as SEQ ID NO:287 for
Vif, SEQ ID NO:285 for Vpr and SEQ ID NO:319 for Vpu.

Figure 102, sheets 1 and 2, depicts the nucleotide sequence from the 3'
region of the clone designated 8_2_TV1 (SEQ ID NO:181).

Figure 103, sheets 1 to 5, depicts the nucleotide sequence of 
2_1/4_TV12_C_ZA (SEQ ID NO:182). 

Figure 104 depicts the nucleotide sequence of gp140.modTV1.mut1.dV2 (SEQ
ID NO:183).

Figure 105 depicts the nucleotide sequence of gp140mod.TV1.mut2.dV2 (SEQ
ID NO:184).

Figure 106 depicts the nucleotide sequence of gp140mod.TV1.mut3.dV2 (SEQ
ID NO:185).

Figure 107 depicts the nucleotide sequence of gp140mod.TV1.mut4.dV2 (SEQ
ID NO:186).

Figure 108 depicts the nucleotide sequence of gp140.mod.TV1.GM161 (SEQ
ID NO:187).

Figure 109 depicts the nucleotide sequence of gp140mod.TV1.GM161-195-204
(SEQ ID NO:188).

Figure 110 depicts the nucleotide sequence of gp140mod.TV1.GM161-204
(SEQ ID NO:189).

Figure 111 depicts the nucleotide sequence of gp140mod.TV1.GM-V1V2
(SEQ ID NO:190).

Figure 112 depicts the nucleotide sequence of
gp140modC8.2mut7.delV2.KozmodTa (SEQ ID NO:191).

Figure 113 depicts alignment of the amino acid sequences of various Env
cleavage site mutants (translation of gp140mod.TV1.delV2 (SEQ ID NO:192);
translation of gp140mod.TV1.mut1.dV2 (SEQ ID NO:193); translation of
gp140mod.TV1.mut2.dV2 (SEQ ID NO:194); translation of
gp140mod.TV1.mut3.dV2 (SEQ ID NO:195); translation of
gp140mod.TV1.mut4.dV2 (SEQ ID NO:196); and translation of
gp140mod.TV1.mut7.dV2 (SEQ ID NO:197)). Amino acid changes are shown in
bold.

Figure 114 depicts alignment of amino acid sequences of various Env
glycosylation mutants (GM), including translation of gp140mod.TV1 (SEQ ID
NO:198); translation of gp140mod.TV1.GM161 (SEQ ID NO:199); translation of
gp140mod.TV1.GM161-204 (SEQ ID NO:200); translation of
gp140mod.TV1.GM161-195-204 (SEQ ID NO:201); and translation of
gp140mod.TV1.GM-V1V2 (SEQ ID NO:202).

Figure 115 depicts the nucleotide sequence of Nef-myrD124LLAA (SEQ ID
NO:203).

Figure 116 depicts the amino acid sequence of the protein translated (SEQ ID
NO:204) from Nef-myrD124LLAA.

Figure 117 depicts the nucleotide sequence of gp160mod.TV2 (SEQ ID
NO:205).

Figure 118 presents an overview of genome organization of HIV-1 and useful 
subgenomic fragments. 

Figure 119 is a graph depicting log geometric mean antibody titers in
immunized rabbits following immunization with Env DNA and protein.

Figure 120 is a bar graph depicting comparison of ELISA titers against subtype
B and C Env proteins in rabbit sera collected after 3 DNA immunizations and a single
protein boost.

Figure 121 presents data of neutralizing antibody responses against subtype B
SF162 EnvdV2 strain in rabbits immunized with subtype C TV1 Env in a DNA prime
protein boost regimen.

Figure 122 presents data of neutralizing antibody responses against subtype C
primary strains TV1 and TV2 in a 5.25 reporter cell assay after a single protein boost.

Figure 123 presents data of neutralizing antibody responses against subtype C,
TV1 and Du174, and subtype B, SF162, after a single protein boost (as measured by
Duke PBMC assay).
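
By way of illustration only, the "log geometric mean" titer referred to for Figure 119 is the mean of the log10-transformed titers, and its antilog is the geometric mean titer. The following Python sketch is not part of the original disclosure; the function names are hypothetical and the example titers are invented for the arithmetic and are not data from the figures.

```python
import math


def log10_geometric_mean(titers):
    """Return the mean of log10(titer), i.e. log10 of the geometric mean titer."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(t <= 0 for t in titers):
        raise ValueError("titers must be positive to take a logarithm")
    return sum(math.log10(t) for t in titers) / len(titers)


def geometric_mean_titer(titers):
    """Return the geometric mean titer (antilog of the mean log10 titer)."""
    return 10 ** log10_geometric_mean(titers)


# Hypothetical endpoint titers for three animals (illustrative values only).
example_titers = [100, 1000, 10000]
example_log_gmt = log10_geometric_mean(example_titers)
example_gmt = geometric_mean_titer(example_titers)
```

Averaging on the log scale keeps a single high-titer animal from dominating the group summary, which is one common reason such data are plotted as log geometric means.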



Detailed Description of the Invention 

The practice of the present invention will employ, unless otherwise indicated,
conventional methods of chemistry, biochemistry, molecular biology, immunology and
pharmacology, within the skill of the art. Such techniques are explained fully in the
literature. See, e.g., Remington's Pharmaceutical Sciences, 18th Edition (Easton,
Pennsylvania: Mack Publishing Company, 1990); Methods In Enzymology (S.
Colowick and N. Kaplan, eds., Academic Press, Inc.); and Handbook of Experimental
Immunology, Vols. I-IV (D.M. Weir and C.C. Blackwell, eds., 1986, Blackwell
Scientific Publications); Sambrook, et al., Molecular Cloning: A Laboratory Manual
(2nd Edition, 1989); Short Protocols in Molecular Biology, 4th ed. (Ausubel et al.,
eds., 1999, John Wiley & Sons); Molecular Biology Techniques: An Intensive
Laboratory Course, (Ream et al., eds., 1998, Academic Press); PCR (Introduction to
Biotechniques Series), 2nd ed. (Newton & Graham eds., 1997, Springer Verlag).

As used in this specification, the singular forms "a," "an" and "the" include
plural references unless the content clearly dictates otherwise. Thus, for example,
reference to "an antigen" includes a mixture of two or more such agents.

1. Definitions

In describing the present invention, the following terms will be employed, and
are intended to be defined as indicated below.

"Synthetic" sequences, as used herein, refer to HIV polypeptide-encoding
polynucleotides whose expression has been modified as described herein, for example,
by codon substitution, altered activities, and/or inactivation of inhibitory sequences.
"Wild-type" or "native" sequences, as used herein, refer to polypeptide-encoding
sequences that are essentially as they are found in nature, e.g., Gag, Pol, Vif, Vpr, Tat,
Rev, Vpu, Env and/or Nef encoding sequences as found in HIV isolates, e.g., SF162,
SF2, AF110965, AF110967, AF110968, AF110975, 8_5_TV1_C.ZA,
8_2_TV1_C.ZA or 12-5.1_TV2_C.ZA. The various regions of the HIV genome are
shown in Table A, with numbering relative to 8_5_TV1_C.ZA (Figures 1A-1D).
Thus, the term "Pol" refers to one or more of the following polypeptides: polymerase
(p6Pol); protease (prot); reverse transcriptase (p66RT or RT); RNAseH
(p15RNAseH); and/or integrase (p31Int or Int). Identification of gene regions for any
selected HIV isolate can be performed by one of ordinary skill in the art based on the
teachings presented herein and the information known in the art, for example, by
performing alignments relative to 8_5_TV1_C.ZA (Figures 1A-1D) or alignment to
other known HIV isolates, for example, Subtype B isolates with gene regions (e.g.,
SF2, GenBank Accession Number K02007; SF162, GenBank Accession Number
M38428) and Subtype C isolates with gene regions (e.g., GenBank Accession Number
AF110965 and GenBank Accession Number AF110975).

As used herein, the term "virus-like particle" or "VLP" refers to a
nonreplicating, viral shell, derived from any of several viruses discussed further below.
VLPs are generally composed of one or more viral proteins, such as, but not limited to
those proteins referred to as capsid, coat, shell, surface and/or envelope proteins, or
particle-forming polypeptides derived from these proteins. VLPs can form
spontaneously upon recombinant expression of the protein in an appropriate
expression system. Methods for producing particular VLPs are known in the art and
discussed more fully below. The presence of VLPs following recombinant expression
of viral proteins can be detected using conventional techniques known in the art, such
as by electron microscopy, X-ray crystallography, and the like. See, e.g., Baker et al.,
Biophys. J. (1991) 60:1445-1456; Hagensee et al., J. Virol. (1994) 68:4503-4505.
For example, VLPs can be isolated by density gradient centrifugation and/or identified
by characteristic density banding. Alternatively, cryoelectron microscopy can be
performed on vitrified aqueous samples of the VLP preparation in question, and
images recorded under appropriate exposure conditions.

By "particle-forming polypeptide" derived from a particular viral protein is
meant a full-length or near full-length viral protein, as well as a fragment thereof, or a
viral protein with internal deletions, which has the ability to form VLPs under
conditions that favor VLP formation. Accordingly, the polypeptide may comprise the
full-length sequence, fragments, truncated and partial sequences, as well as analogs
and precursor forms of the reference molecule. The term therefore intends deletions,
additions and substitutions to the sequence, so long as the polypeptide retains the
ability to form a VLP. Thus, the term includes natural variations of the specified
polypeptide since variations in coat proteins often occur between viral isolates. The
term also includes deletions, additions and substitutions that do not naturally occur in
the reference protein, so long as the protein retains the ability to form a VLP.

Preferred substitutions are those which are conservative in nature, i.e., those
substitutions that take place within a family of amino acids that are related in their side
chains. Specifically, amino acids are generally divided into four families: (1) acidic -
aspartate and glutamate; (2) basic - lysine, arginine, histidine; (3) non-polar - alanine,
valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan; and (4)
uncharged polar - glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine.
Phenylalanine, tryptophan, and tyrosine are sometimes classified as aromatic amino
acids.
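
By way of illustration only, the four-family grouping above can be written as a lookup table and used to flag whether a proposed substitution stays within a family. The following Python sketch is not part of the original disclosure: the names FAMILIES, FAMILY_OF and is_conservative are hypothetical, one-letter amino acid codes are assumed, and membership in the same family is taken as the sole criterion for a conservative substitution.

```python
# Four side-chain families as listed in the text, in one-letter codes.
FAMILIES = {
    "acidic": set("DE"),                # aspartate, glutamate
    "basic": set("KRH"),                # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),       # Ala, Val, Leu, Ile, Pro, Phe, Met, Trp
    "uncharged polar": set("GNQCSTY"),  # Gly, Asn, Gln, Cys, Ser, Thr, Tyr
}

# Reverse lookup: residue -> family name.
FAMILY_OF = {aa: name for name, members in FAMILIES.items() for aa in members}


def is_conservative(original, replacement):
    """Return True if both residues belong to the same side-chain family."""
    a, b = original.upper(), replacement.upper()
    for residue in (a, b):
        if residue not in FAMILY_OF:
            raise ValueError(f"unknown amino acid code: {residue!r}")
    return FAMILY_OF[a] == FAMILY_OF[b]


leu_to_ile = is_conservative("L", "I")   # both non-polar
asp_to_lys = is_conservative("D", "K")   # acidic versus basic
```

A substitution table of this kind says nothing about whether a variant still forms a VLP; under the definition above, that functional test remains the governing requirement.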

The term "HIV polypeptide" refers to any amino acid sequence that exhibits
sequence homology to native HIV polypeptides (e.g., Gag, Env, Prot, Pol, RT, Int, vif,
vpr, vpu, tat, rev, nef and/or combinations thereof) and/or which is functional. Non-
limiting examples of functions that may be exhibited by HIV polypeptides include use
as immunogens (e.g., to generate a humoral and/or cellular immune response), use in
diagnostics (e.g., bound by suitable antibodies for use in ELISAs or other
immunoassays) and/or polypeptides which exhibit one or more biological activities
associated with the wild type or synthetic HIV polypeptide. For example, as used
herein, the term "Gag polypeptide" may refer to a polypeptide that is bound by one or
more anti-Gag antibodies; elicits a humoral and/or cellular immune response; and/or
exhibits the ability to form particles.

An "antigen" refers to a molecule containing one or more epitopes (either
linear, conformational or both) that will stimulate a host's immune system to make a
humoral and/or cellular antigen-specific response. The term is used interchangeably
with the term "immunogen." Normally, a B-cell epitope will include at least about 5
amino acids but can be as small as 3-4 amino acids. A T-cell epitope, such as a CTL
epitope, will include at least about 7-9 amino acids, and a helper T-cell epitope at least
about 12-20 amino acids. Normally, an epitope will include between about 7 and 15
amino acids, such as 9, 10, 12 or 15 amino acids. The term "antigen" denotes both
subunit antigens (i.e., antigens which are separate and discrete from a whole organism
with which the antigen is associated in nature), as well as killed, attenuated or
inactivated bacteria, viruses, fungi, parasites or other microbes. Antibodies such as
anti-idiotype antibodies, or fragments thereof, and synthetic peptide mimotopes, which
can mimic an antigen or antigenic determinant, are also captured under the definition
of antigen as used herein. Similarly, an oligonucleotide or polynucleotide which
expresses an antigen or antigenic determinant in vivo, such as in gene therapy and
DNA immunization applications, is also included in the definition of antigen herein.

For purposes of the present invention, antigens can be derived from any of
several known viruses, bacteria, parasites and fungi, as described more fully below.
The term also intends any of the various tumor antigens. Furthermore, for purposes of
the present invention, an "antigen" refers to a protein which includes modifications,
such as deletions, additions and substitutions (generally conservative in nature), to the
native sequence, so long as the protein maintains the ability to elicit an immunological
response, as defined herein. These modifications may be deliberate, as through site-
directed mutagenesis, or may be accidental, such as through mutations of hosts which
produce the antigens.

An "immunological response" to an antigen or composition is the development
in a subject of a humoral and/or a cellular immune response to an antigen present in the
composition of interest. For purposes of the present invention, a "humoral immune
response" refers to an immune response mediated by antibody molecules, while a
"cellular immune response" is one mediated by T-lymphocytes and/or other white
blood cells. One important aspect of cellular immunity involves an antigen-specific
response by cytolytic T-cells ("CTL"s). CTLs have specificity for peptide antigens
that are presented in association with proteins encoded by the major histocompatibility
complex (MHC) and expressed on the surfaces of cells. CTLs help induce and
promote the destruction of intracellular microbes, or the lysis of cells infected with
such microbes. Another aspect of cellular immunity involves an antigen-specific
response by helper T-cells. Helper T-cells act to help stimulate the function, and focus
the activity of, nonspecific effector cells against cells displaying peptide antigens in
association with MHC molecules on their surface. A "cellular immune response" also
refers to the production of cytokines, chemokines and other such molecules produced
by activated T-cells and/or other white blood cells, including those derived from CD4+
and CD8+ T-cells.

A composition or vaccine that elicits a cellular immune response may serve to
sensitize a vertebrate subject by the presentation of antigen in association with MHC
molecules at the cell surface. The cell-mediated immune response is directed at, or
near, cells presenting antigen at their surface. In addition, antigen-specific T-
lymphocytes can be generated to allow for the future protection of an immunized host.

The ability of a particular antigen to stimulate a cell-mediated immunological
response may be determined by a number of assays, such as by lymphoproliferation
(lymphocyte activation) assays, CTL cytotoxic cell assays, or by assaying for T-
lymphocytes specific for the antigen in a sensitized subject. Such assays are well
known in the art. See, e.g., Erickson et al., J. Immunol. (1993) 151:4189-4199; Doe
et al., Eur. J. Immunol. (1994) 24:2369-2376. Recent methods of measuring cell-mediated
immune response include measurement of intracellular cytokines or cytokine
secretion by T-cell populations, or by measurement of epitope specific T-cells (e.g., by
the tetramer technique) (reviewed by McMichael, A.J., and O'Callaghan, C.A., J. Exp.
Med. 187(9):1367-1371, 1998; McHeyzer-Williams, M.G., et al., Immunol. Rev. 150:5-
21, 1996; Lalvani, A., et al., J. Exp. Med. 186:859-865, 1997).

Thus, an immunological response as used herein may be one which stimulates
the production of CTLs, and/or the production or activation of helper T-cells. The
antigen of interest may also elicit an antibody-mediated immune response. Hence, an
immunological response may include one or more of the following effects: the
production of antibodies by B-cells; and/or the activation of suppressor T-cells and/or
γδ T-cells directed specifically to an antigen or antigens present in the composition or
vaccine of interest. These responses may serve to neutralize infectivity, and/or mediate
antibody-complement, or antibody dependent cell cytotoxicity (ADCC) to provide
protection to an immunized host. Such responses can be determined using standard
immunoassays and neutralization assays, well known in the art.

An "immunogenic composition" is a composition that comprises an antigenic
molecule where administration of the composition to a subject results in the
development in the subject of a humoral and/or a cellular immune response to the
antigenic molecule of interest. The immunogenic composition can be introduced
directly into a recipient subject, such as by injection, inhalation, oral, intranasal and
mucosal (e.g., intra-rectally or intra-vaginally) administration.

By "subunit vaccine" is meant a vaccine composition which includes one or
more selected antigens but not all antigens, derived from or homologous to, an antigen
from a pathogen of interest such as from a virus, bacterium, parasite or fungus. Such a
composition is substantially free of intact pathogen cells or pathogenic particles, or the
lysate of such cells or particles. Thus, a "subunit vaccine" can be prepared from at
least partially purified (preferably substantially purified) immunogenic polypeptides
from the pathogen, or analogs thereof. The method of obtaining an antigen included in
the subunit vaccine can thus include standard purification techniques, recombinant
production, or synthetic production.

"Substantially purified" generally refers to isolation of a substance (compound,
polynucleotide, protein, polypeptide, polypeptide composition) such that the substance
comprises the majority percent of the sample in which it resides. Typically in a sample
a substantially purified component comprises 50%, preferably 80%-85%, more
preferably 90-95% of the sample. Techniques for purifying polynucleotides and
polypeptides of interest are well-known in the art and include, for example, ion-
exchange chromatography, affinity chromatography and sedimentation according to
density.

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is
a nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide in vivo when placed under the control of
appropriate regulatory sequences (or "control elements"). The boundaries of the
coding sequence are determined by a start codon at the 5' (amino) terminus and a
translation stop codon at the 3' (carboxy) terminus. A coding sequence can include,
but is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic
DNA sequences from viral or procaryotic DNA, and even synthetic DNA sequences.
A transcription termination sequence such as a stop codon may be located 3' to the
coding sequence.

Typical "control elements" include, but are not limited to, transcription
promoters, transcription enhancer elements, transcription termination signals,
polyadenylation sequences (located 3' to the translation stop codon), sequences for
optimization of initiation of translation (located 5' to the coding sequence), and
translation termination sequences. For example, the sequences and/or vectors
described herein may also include one or more additional sequences that may optimize
translation and/or termination including, but not limited to, a Kozak sequence (e.g.,
GCCACC, nucleotides 1 to 6 of SEQ ID NO:191) placed in front (5') of the ATG of
the codon-optimized wild-type leader or any other suitable leader sequence (e.g., tpa1,
tpa2, wtLnat (native wild-type leader)) or a termination sequence (e.g., TAA or,
preferably, TAAA, nucleotides 1978 to 1981 of SEQ ID NO:191) placed after (3') the
coding sequence.
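
As a non-limiting illustration of the arrangement just described, the following Python sketch places the Kozak sequence GCCACC 5' of the initiating ATG and the termination sequence TAAA 3' of the coding sequence. It is not part of the original disclosure: the function name add_translation_signals is hypothetical, the choice to replace an existing terminal stop codon with the termination sequence is an assumption made for the example, and the short toy coding sequence is not a sequence of the invention.

```python
KOZAK = "GCCACC"        # example Kozak sequence given in the text
TERMINATION = "TAAA"    # preferred termination sequence given in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}


def add_translation_signals(coding_sequence, kozak=KOZAK, termination=TERMINATION):
    """Return kozak + coding sequence + termination sequence.

    Assumption for this sketch: if the input already ends in a stop codon,
    that codon is removed so the termination sequence supplies the stop.
    """
    seq = coding_sequence.upper()
    if not seq.startswith("ATG"):
        raise ValueError("coding sequence must begin with an ATG start codon")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if len(seq) > 3 and seq[-3:] in STOP_CODONS:
        seq = seq[:-3]
    return kozak + seq + termination


# Toy input: ATG GCT TGA (Met-Ala-stop), for illustration only.
construct = add_translation_signals("ATGGCTTGA")
```

In this sketch the Kozak sequence occupies positions 1 to 6 of the result, mirroring the placement of GCCACC at nucleotides 1 to 6 of SEQ ID NO:191.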

A "polynucleotide coding sequence" or a sequence which "encodes" a selected
polypeptide, is a nucleic acid molecule which is transcribed (in the case of DNA) and
translated (in the case of mRNA) into a polypeptide in vivo when placed under the
control of appropriate regulatory sequences (or "control elements"). The boundaries
of the coding sequence are determined by a start codon, for example, at or near the 5'
terminus and a translation stop codon, for example, at or near the 3' terminus.
Exemplary coding sequences are the modified viral polypeptide-coding sequences of
the present invention. The coding regions of the polynucleotide sequences of the
present invention are identifiable by one of skill in the art and may, for example, be
easily identified by performing translations of all three frames of the polynucleotide and
identifying the frame corresponding to the encoded polypeptide; for example, a
synthetic nef polynucleotide of the present invention encodes a nef-derived
polypeptide. A transcription termination sequence may be located 3' to the coding
sequence. Typical "control elements" include, but are not limited to, transcription
regulators, such as promoters, transcription enhancer elements, transcription
termination signals, and polyadenylation sequences; and translation regulators, such as
sequences for optimization of initiation of translation, e.g., Shine-Dalgarno (ribosome
binding site) sequences, Kozak sequences (i.e., sequences for the optimization of
translation, located, for example, 5' to the coding sequence), leader sequences,
translation initiation codon (e.g., ATG), and translation termination sequences. In
certain embodiments, one or more translation regulation or initiation sequences (e.g.,
the leader sequence) are derived from wild-type translation initiation sequences, i.e.,
sequences that regulate translation of the coding region in their native state. Wild-type
leader sequences that have been modified, using the methods described herein, also
find use in the present invention. Promoters can include inducible promoters (where
expression of a polynucleotide sequence operably linked to the promoter is induced by
an analyte, cofactor, regulatory protein, etc.), repressible promoters (where expression
of a polynucleotide sequence operably linked to the promoter is repressed by an analyte,
cofactor, regulatory protein, etc.), and constitutive promoters.
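
By way of illustration only, identifying a coding region by translating all three frames, as mentioned above, can be sketched in Python as follows. This is not part of the original disclosure: the function names are hypothetical, the standard genetic code is assumed, only the three forward frames are examined, and "the frame corresponding to the encoded polypeptide" is approximated here as the frame containing the longest methionine-initiated open reading frame. The toy input is not a sequence of the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate one forward frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def longest_orf(protein):
    """Longest stretch that starts at M and runs to the next stop (or the end)."""
    best = ""
    for segment in protein.split("*"):
        start = segment.find("M")
        if start != -1 and len(segment) - start > len(best):
            best = segment[start:]
    return best


def best_forward_frame(dna):
    """Return (frame, peptide) for the frame with the longest M-initiated ORF."""
    candidates = [(frame, longest_orf(translate(dna, frame))) for frame in range(3)]
    return max(candidates, key=lambda item: len(item[1]))


# Toy input: two leading bases, then ATG GCT AAA GGT TAA.
frame, peptide = best_forward_frame("CCATGGCTAAAGGTTAA")
```

The peptide recovered this way would then be compared with the expected polypeptide (for example, a Nef-derived sequence) to confirm that the chosen frame is the intended one.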

A "nucleic acid" molecule can include, but is not limited to, procaryotic
sequences, eucaryotic mRNA, cDNA from eucaryotic mRNA, genomic DNA
sequences from eucaryotic (e.g., mammalian) DNA, and even synthetic DNA
sequences. The term also captures sequences that include any of the known base
analogs of DNA and RNA.

"Operably linked" refers to an arrangement of elements wherein the
components so described are configured so as to perform their usual function. Thus, a
given promoter operably linked to a coding sequence is capable of effecting the
expression of the coding sequence when the proper enzymes are present. The
promoter need not be contiguous with the coding sequence, so long as it functions to
direct the expression thereof. Thus, for example, intervening untranslated yet
transcribed sequences can be present between the promoter sequence and the coding
sequence and the promoter sequence can still be considered "operably linked" to the
coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue
of its origin or manipulation: (1) is not associated with all or a portion of the
polynucleotide with which it is associated in nature; and/or (2) is linked to a
polynucleotide other than that to which it is linked in nature. The term "recombinant"
as used with respect to a protein or polypeptide means a polypeptide produced by
expression of a recombinant polynucleotide. "Recombinant host cells," "host cells,"
"cells," "cell lines," "cell cultures," and other such terms denoting procaryotic
microorganisms or eucaryotic cell lines cultured as unicellular entities, are used
interchangeably, and refer to cells which can be, or have been, used as recipients for
recombinant vectors or other transfer DNA, and include the progeny of the original
cell which has been transfected. It is understood that the progeny of a single parental
cell may not necessarily be completely identical in morphology or in genomic or total
DNA complement to the original parent, due to accidental or deliberate mutation.
Progeny of the parental cell which are sufficiently similar to the parent to be
characterized by the relevant property, such as the presence of a nucleotide sequence
encoding a desired peptide, are included in the progeny intended by this definition, and
are covered by the above terms.

Techniques for determining amino acid sequence "similarity" are well known in 
the art. In general, "similarity" means the exact amino acid to amino acid comparison 
of two or more polypeptides at the appropriate place, where amino acids are identical 
or possess similar chemical and/or physical properties such as charge or 
hydrophobicity. A so-termed "percent similarity" then can be determined between the 
compared polypeptide sequences. Techniques for determining nucleic acid and amino 
acid sequence identity also are well known in the art and include determining the 
nucleotide sequence of the mRNA for that gene (usually via a cDNA intermediate) and 
determining the amino acid sequence encoded thereby, and comparing this to a second 
amino acid sequence. In general, "identity" refers to an exact nucleotide to nucleotide 
or amino acid to amino acid correspondence of two polynucleotides or polypeptide 
sequences, respectively. 

Two or more polynucleotide sequences can be compared by determining their 
"percent identity." Two or more amino acid sequences likewise can be compared by 
determining their "percent identity." The percent identity of two sequences, whether 
nucleic acid or peptide sequences, is generally described as the number of exact 
matches between two aligned sequences divided by the length of the shorter sequence 
and multiplied by 100. An approximate alignment for nucleic acid sequences is 
provided by the local homology algorithm of Smith and Waterman, Advances in 
Applied Mathematics 2:482-489 (1981). This algorithm can be extended to use with 
peptide sequences using the scoring matrix developed by Dayhoff, Atlas of Protein 
Sequences and Structure, M.O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical 
Research Foundation, Washington, D.C., USA, and normalized by Gribskov, Nucl. 
Acids Res. 14(6):6745-6763 (1986). An implementation of this algorithm for nucleic 
acid and peptide sequences is provided by the Genetics Computer Group (Madison, 
WI) in their BestFit utility application. The default parameters for this method are 
described in the Wisconsin Sequence Analysis Package Program Manual, Version 8 
(1995) (available from Genetics Computer Group, Madison, WI). Other equally 
suitable programs for calculating the percent identity or similarity between sequences 
are generally known in the art. 
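
By way of a non-limiting illustration, the percent identity calculation described above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched as follows. The function name, the use of "-" as the gap character, and the case-insensitive comparison are assumptions made for this example only; the sketch takes an alignment as input and is not the BestFit implementation.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two already-aligned sequences.

    Counts columns where both sequences carry the same residue, divides by
    the ungapped length of the shorter sequence, and multiplies by 100.
    The gap character "-" is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    a, b = aligned_a.upper(), aligned_b.upper()
    # A column is an exact match only if both residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    shorter = min(len(a.replace(gap, "")), len(b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("each sequence must contain at least one residue")
    return 100.0 * matches / shorter


if __name__ == "__main__":
    # Five of six aligned positions match; both sequences are six residues long.
    print(round(percent_identity("ATGCCA", "ATGACA"), 2))
```

Dividing by the shorter ungapped length follows the definition given in the text; other conventions (for example, dividing by the alignment length) would give different values for gapped alignments.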

For example, percent identity of a particular nucleotide sequence to a reference 
sequence can be determined using the homology algorithm of Smith and Waterman 
with a default scoring table and a gap penalty of six nucleotide positions. Another 
method of establishing percent identity in the context of the present invention is to use 
the MPSRCH package of programs copyrighted by the University of Edinburgh, 
developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, 
Inc. (Mountain View, CA). From this suite of packages, the Smith-Waterman 
algorithm can be employed where default parameters are used for the scoring table (for 
example, gap open penalty of 12, gap extension penalty of one, and a gap of six). 
From the data generated, the "Match" value reflects "sequence identity." Other 
suitable programs for calculating the percent identity or similarity between sequences 
are generally known in the art, such as the alignment program BLAST, which can also 
be used with default parameters. For example, BLASTN and BLASTP can be used 
with the following default parameters: genetic code = standard; filter = none; strand = 
both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50 sequences; 
sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL + DDBJ + 
PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details of these 
programs can be found at the following internet address: 
http://www.ncbi.nlm.gov/cgi-bin/BLAST. 
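
For illustration only, a minimal sketch of Smith-Waterman local alignment is given below. It is a simplification and not the MPSRCH or BestFit implementation: it assumes a flat match/mismatch score and a single linear gap penalty in place of a scoring table with separate gap open and gap extension penalties, and the function name and default scores are hypothetical.

```python
def smith_waterman(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Simplified sketch: linear gap penalty, flat match/mismatch scores.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; scores are floored at zero.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG"))
```

An affine scheme such as the gap open penalty of 12 and gap extension penalty of one mentioned above would require additional matrices to track open gaps; the linear penalty is used here only to keep the sketch short.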

One of skill in the art can readily determine the proper search parameters to use 
for a given sequence; exemplary preferred Smith-Waterman based parameters are 
presented above. For example, the search parameters may vary based on the size of 
the sequence in question. Thus, for the polynucleotide sequences of the present 
invention the length of the polynucleotide sequence disclosed herein is searched against 
a selected database and compared to sequences of essentially the same length to 
determine percent identity. For example, a representative embodiment of the present 
invention would include an isolated polynucleotide comprising X contiguous 
nucleotides, wherein (i) the X contiguous nucleotides have at least about a selected 
level of percent identity relative to Y contiguous nucleotides of one or more of the 
sequences described herein (e.g., in Table Q or fragment thereof), and (ii) for search 
purposes X equals Y, wherein Y is a selected reference polynucleotide of defined 
length (for example, a length of from 15 nucleotides up to the number of nucleotides 
present in a selected full-length sequence). 
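
As a hedged illustration of the X equals Y comparison described above, the sketch below slides a query of X contiguous nucleotides along a longer reference and reports the window of Y = X contiguous nucleotides with the highest percent identity. It assumes an ungapped, position-by-position comparison, which is a simplification of a database search; the function name and example sequences are hypothetical.

```python
def best_window_identity(query: str, reference: str):
    """Return (start, percent_identity) of the best ungapped reference window.

    The window length Y equals the query length X. `start` is a 0-based
    offset into `reference`. Ties are resolved in favor of the first window.
    """
    x = len(query)
    if x == 0 or x > len(reference):
        raise ValueError("query must be non-empty and no longer than reference")
    best_start, best_pct = -1, -1.0
    for start in range(len(reference) - x + 1):
        window = reference[start:start + x]
        matches = sum(1 for q, r in zip(query, window) if q == r)
        pct = 100.0 * matches / x
        if pct > best_pct:
            best_start, best_pct = start, pct
    return best_start, best_pct


if __name__ == "__main__":
    query = "ACGTACGTACGTACG"            # X = 15 contiguous nucleotides
    reference = "TTTACGTACGTACGTACCGG"   # contains a 15-mer differing at one position
    print(best_window_identity(query, reference))
```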

The sequences of the present invention can include fragments of the sequences, 
for example, from about 15 nucleotides up to the number of nucleotides present in the 
full-length sequences described herein (e.g., see the Figures), including all integer 
values falling within the above-described range. For example, fragments of the 
polynucleotide sequences of the present invention may be 30-60 nucleotides, 60-120 
nucleotides, 120-240 nucleotides, 240-480 nucleotides, 480-1000 nucleotides, and all 
integer values therebetween. 
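
The fragment range described above (every integer length from about 15 nucleotides up to the full length) can be enumerated as in the following sketch. The generator name and the toy sequence are assumptions made for illustration; the number of fragments grows quadratically with sequence length, so this is practical only for short inputs.

```python
from typing import Iterator, Optional, Tuple


def iter_fragments(sequence: str, min_len: int = 15,
                   max_len: Optional[int] = None) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for each contiguous fragment of the sequence.

    Lengths run over every integer from `min_len` up to `max_len`
    (default: the full sequence length). `start` is a 0-based offset.
    """
    n = len(sequence)
    upper = n if max_len is None else min(max_len, n)
    for length in range(min_len, upper + 1):
        for start in range(n - length + 1):
            yield start, sequence[start:start + length]


if __name__ == "__main__":
    toy = "ACGT" * 5  # a 20-nucleotide toy sequence, not a sequence of the invention
    print(sum(1 for _ in iter_fragments(toy)))
```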

The synthetic expression cassettes (and purified polynucleotides) of the present 
invention include related polynucleotide sequences having about 80% to 100%, greater 
than 80-85%, preferably greater than 90-92%, more preferably greater than 95%, and 
most preferably greater than 98% up to 100% (including all integer values falling 
within these described ranges) sequence identity to the synthetic expression cassette 
and/or polynucleotide sequences disclosed herein (for example, to the sequences of 
the present invention) when the sequences of the present invention are used as the 
query sequence against, for example, a database of sequences. 

Two nucleic acid fragments are considered to "selectively hybridize" as 
described herein. The degree of sequence identity between two nucleic acid molecules 
affects the efficiency and strength of hybridization events between such molecules. A 
partially identical nucleic acid sequence will at least partially inhibit a completely 
identical sequence from hybridizing to a target molecule. Inhibition of hybridization of 
the completely identical sequence can be assessed using hybridization assays that are 
well known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the 
like, see Sambrook, et al., supra or Ausubel et al., supra). Such assays can be 
conducted using varying degrees of selectivity, for example, using conditions varying 
from low to high stringency. If conditions of low stringency are employed, the 
absence of non-specific binding can be assessed using a secondary probe that lacks 
even a partial degree of sequence identity (for example, a probe having less than about 
30% sequence identity with the target molecule), such that, in the absence of 
non-specific binding events, the secondary probe will not hybridize to the target. 

When utilizing a hybridization-based detection system, a nucleic acid probe is 
chosen that is complementary to a target nucleic acid sequence, and then by selection 
of appropriate conditions the probe and the target sequence "selectively hybridize," or 
bind, to each other to form a hybrid molecule. A nucleic acid molecule that is capable 
of hybridizing selectively to a target sequence under "moderately stringent" conditions 
typically hybridizes under conditions that allow detection of a target nucleic acid 
sequence of at least about 10-14 nucleotides in length having at least approximately 
70% sequence identity with the sequence of the selected nucleic acid probe. Stringent 
hybridization conditions typically allow detection of target nucleic acid sequences of at 
least about 10-14 nucleotides in length having a sequence identity of greater than about 
90-95% with the sequence of the selected nucleic acid probe. Hybridization conditions 
useful for probe/target hybridization where the probe and target have a specific degree 
of sequence identity can be determined as is known in the art (see, for example, Nucleic 
Acid Hybridization: A Practical Approach, editors B.D. Hames and S.J. Higgins, 
(1985) Oxford; Washington, DC; IRL Press). 

With respect to stringency conditions for hybridization, it is well known in the 
art that numerous equivalent conditions can be employed to establish a particular 
stringency by varying, for example, the following factors: the length and nature of 
probe and target sequences, base composition of the various sequences, concentrations 
of salts and other hybridization solution components, the presence or absence of 
blocking agents in the hybridization solutions (e.g., formamide, dextran sulfate, and 
polyethylene glycol), hybridization reaction temperature and time parameters, as well 
as varying wash conditions. A particular set of hybridization conditions is selected 
following standard methods in the art (see, for example, 
Sambrook, et al., supra or Ausubel et al., supra). 

A first polynucleotide is "derived from" a second polynucleotide if it has the 
same or substantially the same basepair sequence as a region of the second 
polynucleotide, its cDNA, complements thereof, or if it displays sequence identity as 
described above. 

A first polypeptide is "derived from" a second polypeptide if it is (i) encoded by 
a first polynucleotide derived from a second polynucleotide, or (ii) displays sequence 
identity to the second polypeptides as described above. 

Generally, a viral polypeptide is "derived from" a particular polypeptide of a 
virus (viral polypeptide) if it is (i) encoded by an open reading frame of a 
polynucleotide of that virus (viral polynucleotide), or (ii) displays sequence identity to 
polypeptides of that virus as described above. 

"Encoded by" refers to a nucleic acid sequence which codes for a polypeptide 
sequence, wherein the polypeptide sequence or a portion thereof contains an amino 
acid sequence of at least 3 to 5 amino acids, more preferably at least 8 to 10 amino 
acids, and even more preferably at least 15 to 20 amino acids from a polypeptide 
encoded by the nucleic acid sequence. Also encompassed are polypeptide sequences 
which are immunologically identifiable with a polypeptide encoded by the sequence. 
Further, polyproteins can be constructed by fusing in-frame two or more 
polynucleotide sequences encoding polypeptide or peptide products. Further, 
polycistronic coding sequences may be produced by placing two or more 
polynucleotide sequences encoding polypeptide products adjacent to each other, typically 
under the control of one promoter, wherein each polypeptide coding sequence may be 
modified to include sequences for internal ribosome binding sites. 

"Purified polynucleotide" refers to a polynucleotide of interest or fragment 
thereof which is essentially free, e.g., contains less than about 50%, preferably less 
than about 70%, and more preferably less than about 90%, of the protein with which 
the polynucleotide is naturally associated. Techniques for purifying polynucleotides of 
interest are well-known in the art and include, for example, disruption of the cell 
containing the polynucleotide with a chaotropic agent and separation of the 
polynucleotide(s) and proteins by ion-exchange chromatography, affinity 
chromatography and sedimentation according to density. 

By "nucleic acid immunization" is meant the introduction of a nucleic acid 
molecule encoding one or more selected antigens into a host cell, for the in vivo 
expression of an antigen, antigens, an epitope, or epitopes. The nucleic acid molecule 
can be introduced directly into a recipient subject, such as by injection, inhalation, oral, 
intranasal and mucosal administration, or the like, or can be introduced ex vivo, into 
cells which have been removed from the host. In the latter case, the transformed cells 
are reintroduced into the subject where an immune response can be mounted against 
the antigen encoded by the nucleic acid molecule. 

"Gene transfer" or "gene delivery" refers to methods or systems for reliably 
inserting DNA of interest into a host cell. Such methods can result in transient 
expression of non-integrated transferred DNA, extrachromosomal replication and 
expression of transferred replicons (e.g., episomes), or integration of transferred 
genetic material into the genomic DNA of host cells. Gene delivery expression vectors 
include, but are not limited to, vectors derived from alphaviruses, pox viruses and 
vaccinia viruses. When used for immunization, such gene delivery expression vectors 
may be referred to as vaccines or vaccine vectors. 

"T lymphocytes" or "T cells" are non-antibody producing lymphocytes that 
constitute a part of the cell-mediated arm of the immune system. T cells arise from 
immature lymphocytes that migrate from the bone marrow to the thymus, where they 
undergo a maturation process under the direction of thymic hormones. Here, the 
mature lymphocytes rapidly divide increasing to very large numbers. The maturing T 
cells become immunocompetent based on their ability to recognize and bind a specific 
antigen. Activation of immunocompetent T cells is triggered when an antigen binds to 
the lymphocyte's surface receptors. 

The term "transfection" is used to refer to the uptake of foreign DNA by a cell. 
A cell has been "transfected" when exogenous DNA has been introduced inside the cell 
membrane. A number of transfection techniques are generally known in the art. See, 
e.g., Graham et al. (1973) Virology, 52:456, Sambrook et al. (1989) Molecular 
Cloning, a laboratory manual, Cold Spring Harbor Laboratories, New York, Davis et 
al. (1986) Basic Methods in Molecular Biology, Elsevier, and Chu et al. (1981) Gene 
13:197. Such techniques can be used to introduce one or more exogenous DNA 
moieties into suitable host cells. The term refers to both stable and transient uptake of 
the genetic material, and includes uptake of peptide- or antibody-linked DNAs. 

A "vector" is capable of transferring gene sequences to target cells (e.g., viral 
vectors, non-viral vectors, particulate carriers, and liposomes). Typically, "vector 
construct," "expression vector," and "gene transfer vector," mean any nucleic acid 
construct capable of directing the expression of a gene of interest and which can 
transfer gene sequences to target cells. Thus, the term includes cloning and expression 
vehicles, as well as viral vectors. 

Transfer of a "suicide gene" (e.g., a drug-susceptibility gene) to a target cell 
renders the cell sensitive to compounds or compositions that are relatively nontoxic to 
normal cells. Moolten, F.L. (1994) Cancer Gene Ther. 1:279-287. Examples of 
suicide genes are thymidine kinase of herpes simplex virus (HSV-tk), cytochrome 
P450 (Manome et al. (1996) Gene Therapy 3:513-520), human deoxycytidine kinase 
(Manome et al. (1996) Nature Medicine 2(5):567-573) and the bacterial enzyme 
cytosine deaminase (Dong et al. (1996) Human Gene Therapy 7:713-720). Cells 
which express these genes are rendered sensitive to the effects of the relatively 
nontoxic prodrugs ganciclovir (HSV-tk), cyclophosphamide (cytochrome P450 2B1), 
cytosine arabinoside (human deoxycytidine kinase) or 5-fluorocytosine (bacterial 
cytosine deaminase). Culver et al. (1992) Science 256:1550-1552; Huber et al. (1994) 
Proc. Natl. Acad. Sci. USA 91:8302-8306. 

A "selectable marker" or "reporter marker" refers to a nucleotide sequence 
included in a gene transfer vector that has no therapeutic activity, but rather is included 
to allow for simpler preparation, manufacturing, characterization or testing of the gene 
transfer vector. 

A "specific binding agent" refers to a member of a specific binding pair of 
molecules wherein one of the molecules specifically binds to the second molecule 
through chemical and/or physical means. One example of a specific binding agent is an 
antibody directed against a selected antigen. 

By "subject" is meant any member of the subphylum chordata, including, 
without limitation, humans and other primates, including non-human primates such as 
rhesus macaque, chimpanzees and other apes and monkey species; farm animals such 
as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats; 
laboratory animals including rodents such as mice, rats and guinea pigs; birds, 
including domestic, wild and game birds such as chickens, turkeys and other 
gallinaceous birds, ducks, geese, and the like. The term does not denote a particular 
age. Thus, both adult and newborn individuals are intended to be covered. The 
system described above is intended for use in any of the above vertebrate species, since 
the immune systems of all of these vertebrates operate similarly. 

By "pharmaceutically acceptable" or "pharmacologically acceptable" is meant a 
material which is not biologically or otherwise undesirable, i.e., the material may be 
administered to an individual in a formulation or composition without causing any 
undesirable biological effects or interacting in a deleterious manner with any of the 
components of the composition in which it is contained. 

By "physiological pH" or a "pH in the physiological range" is meant a pH in 
the range of approximately 7.0 to 8.0 inclusive, more typically in the range of 
approximately 7.2 to 7.6 inclusive. 

As used herein, "treatment" refers to any of (i) the prevention of infection or 
reinfection, as in a traditional vaccine, (ii) the reduction or elimination of symptoms, 
and (iii) the substantial or complete elimination of the pathogen in question. Treatment 
may be effected prophylactically (prior to infection) or therapeutically (following 
infection). 

By "co-administration" is meant administration of more than one composition 
or molecule. Thus, co-administration includes concurrent administration or 
sequential administration (in any order), via the same or different routes of 
administration. Non-limiting examples of co-administration regimes include 
co-administration of nucleic acid and polypeptide; co-administration of different nucleic 
acids (e.g., different expression cassettes as described herein and/or different gene 
delivery vectors); and co-administration of different polypeptides (e.g., different HIV 
polypeptides and/or different adjuvants). The term also encompasses multiple 
administrations of one of the co-administered molecules or compositions (e.g., multiple 
administrations of one or more of the expression cassettes described herein followed 
by one or more administrations of a polypeptide-containing composition). In cases 
where the molecules or compositions are delivered sequentially, the time between each 
administration can be readily determined by one of skill in the art in view of the 
teachings herein. 

"Lentiviral vector" and "recombinant lentiviral vector" refer to a nucleic acid 
construct which carries, and within certain embodiments, is capable of directing the 
expression of a nucleic acid molecule of interest. The lentiviral vector includes at least 
one transcriptional promoter/enhancer or locus defining element(s), or other elements 
which control gene expression by other means such as alternate splicing, nuclear RNA 
export, post-transcriptional modification of messenger, or post-translational 
modification of protein. Such vector constructs must also include a packaging signal, 
long terminal repeats (LTRs) or portion thereof, and positive and negative strand 
primer binding sites appropriate to the retrovirus used (if these are not already present 
in the retroviral vector). Optionally, the recombinant lentiviral vector may also include 
a signal which directs polyadenylation, selectable markers such as Neo, TK, 
hygromycin, phleomycin, histidinol, or DHFR, as well as one or more restriction sites 
and a translation termination sequence. By way of example, such vectors typically 
include a 5' LTR, a tRNA binding site, a packaging signal, an origin of second strand 
DNA synthesis, and a 3' LTR or a portion thereof. 

"Lentiviral vector particle" as utilized within the present invention refers to a 
lentivirus which carries at least one gene of interest. The retrovirus may also contain a 
selectable marker. The recombinant lentivirus is capable of reverse transcribing its 
genetic material (RNA) into DNA and incorporating this genetic material into a host 
cell's DNA upon infection. Lentiviral vector particles may have a lentiviral envelope, a 
non-lentiviral envelope (e.g., an ampho or VSV-G envelope), or a chimeric envelope. 

"Nucleic acid expression vector" or "expression cassette" refers to an assembly 
which is capable of directing the expression of a sequence or gene of interest. The 
nucleic acid expression vector includes a promoter which is operably linked to the 
sequences or gene(s) of interest. Other control elements may be present as well. 
Expression cassettes described herein may be contained within a plasmid construct. In 
addition to the components of the expression cassette, the plasmid construct may also 
include a bacterial origin of replication, one or more selectable markers, a signal which 
allows the plasmid construct to exist as single-stranded DNA (e.g., a M13 origin of 
replication), a multiple cloning site, and a "mammalian" origin of replication (e.g., a 
SV40 or adenovirus origin of replication). 

"Packaging cell" refers to a cell which contains those elements necessary for 
production of infectious recombinant retrovirus which are lacking in a recombinant 
retroviral vector. Typically, such packaging cells contain one or more expression 
cassettes which are capable of expressing proteins which encode Gag, pol and env 
proteins. 

"Producer cell" or "vector producing cell" refers to a cell which contains all 
elements necessary for production of recombinant retroviral vector particles. 

2. Modes of Carrying Out the Invention 

Before describing the present invention in detail, it is to be understood that this 
invention is not limited to particular formulations or process parameters as such may, 
of course, vary. It is also to be understood that the terminology used herein is for the 
purpose of describing particular embodiments of the invention only, and is not intended 
to be limiting. 

Although a number of methods and materials similar or equivalent to those 
described herein can be used in the practice of the present invention, the preferred 
materials and methods are described herein. 

2.1.0. The HIV Genome 

The HIV genome and various polypeptide-encoding regions are shown in Table 
A. The nucleotide positions are given relative to 8_5_TV1_C.ZA (Figure 1; an HIV 
Type C isolate). However, it will be readily apparent to one of ordinary skill in the art 
in view of the teachings of the present disclosure how to determine corresponding 
regions in other HIV strains or variants (e.g., isolates HIVIIIb, HIVSF2, HIV-1SF162, 
HIV-1SF170, HIVLAV, HIVLAI, HIVMN, HIV-1CM235, HIV-1US4, other HIV-1 strains from 
diverse subtypes (e.g., subtypes A through G, and O), HIV-2 strains and diverse 
subtypes (e.g., HIV-2UC1 and HIV-2UC2), and simian immunodeficiency virus (SIV)). 
(See, e.g., Virology, 3rd Edition (W.K. Joklik ed. 1988); Fundamental Virology, 2nd 
Edition (B.N. Fields and D.M. Knipe, eds. 1991); Virology, 3rd Edition (Fields, BN, 
DM Knipe, PM Howley, Editors, 1996, Lippincott-Raven, Philadelphia, PA), for a 
description of these and other related viruses.) Such regions can be determined using, 
for example, sequence comparison programs (e.g., BLAST and others described herein) 
or identification and alignment of structural features (e.g., a program such as the "ALB" 
program described herein that can identify the various regions). 



Table A: Regions of the HIV Genome relative to 8_5_TV1_C.ZA 

Region                                  Position in nucleotide sequence 

5'LTR                                   1-636 
U3                                      1-457 
R                                       458-553 
U5                                      554-636 
NFkB II                                 340-348 
NFkB I                                  354-362 
Sp1 III                                 379-388 
Sp1 II                                  390-398 
Sp1 I                                   400-410 
TATA Box                                429-433 
TAR                                     474-499 
Poly A signal                           529-534 

PBS                                     638-655 

p7 binding region, packaging signal     685-791 

Gag:                                    792-2285 
p17                                     792-1178 
p24                                     1179-1871 
Cyclophilin A bdg.                      1395-1505 
MHR                                     1632-1694 
p2                                      1872-1907 
p7                                      1908-2072 
Frameshift slip                         2072-2078 
p1                                      2073-2120 
p6Gag                                   2121-2285 
Zn-motif I                              1950-1991 
Zn-motif II                             2013-2054 

Pol:                                    2072-5086 
p6Pol                                   2072-2245 
Prot                                    2246-2542 
p66RT                                   2543-4210 
p15RNaseH                               3857-4210 
p31Int                                  4211-5086 

Vif:                                    5034-5612 
Hydrophilic region                      5292-5315 

Vpr:                                    5552-5839 
Oligomerization                         5552-5677 
Amphipathic α-helix                     5597-5653 

Tat:                                    5823-6038 and 8417-8509 
Tat-1 exon                              5823-6038 
Tat-2 exon                              8417-8509 
N-terminal domain                       5823-5885 
Trans-activation domain                 5886-5933 
Transduction domain                     5961-5993 

Rev:                                    5962-6037 and 8416-8663 
Rev-1 exon                              5962-6037 
Rev-2 exon                              8416-8663 
High-affinity bdg. site                 8439-8486 
Leu-rich effector domain                8562-8588 

Vpu:                                    6060-6326 
Transmembrane domain                    6060-6161 
Cytoplasmic domain                      6162-6326 

Env (gp160):                            6244-8853 
Signal peptide                          6244-6324 
gp120                                   6325-7794 
V1                                      6628-6729 
V2                                      6727-6852 
V3                                      7150-7254 
V4                                      7411-7506 
V5                                      7663-7674 
C1                                      6325-6627 
C2                                      6853-7149 
C3                                      7255-7410 
C4                                      7507-7662 
C5                                      7675-7794 
CD4 binding                             7540-7566 
gp41                                    7795-8853 
Fusion peptide                          7789-7842 
Oligomerization domain                  7924-7959 
N-terminal heptad repeat                7921-8028 
C-terminal heptad repeat                8173-8280 
Immunodominant region                   8023-8076 

Nef:                                    8855-9478 
Myristoylation                          8858-8875 
SH3 binding                             9062-9091 
Polypurine tract                        9128-9154 
SH3 binding                             9296-9307 

It will be readily apparent that one of skill in the art can readily align any 
sequence to that shown in Table A to determine relative locations of any particular 
HIV gene. For example, using one of the alignment programs described herein (e.g., 
BLAST), other HIV genomic sequences can be aligned with 8_5_TV1_C.ZA (Table 
A) and locations of genes determined. Polypeptide sequences can be similarly aligned. 
For example, Figures 2A-2C show the alignment of Env polypeptide sequences from 
various strains, relative to SF-162. As described in detail in co-owned WO/39303, 
Env polypeptides (e.g., gp120, gp140 and gp160) include a "bridging sheet" 
comprised of 4 anti-parallel β-strands (β-2, β-3, β-20 and β-21) that form a β-sheet. 
Extruding from one pair of the β-strands (β-2 and β-3) are two loops, V1 and V2. The 
β-2 sheet occurs at approximately amino acid residue 113 (Cys) to amino acid residue 
117 (Thr) while β-3 occurs at approximately amino acid residue 192 (Ser) to amino 
acid residue 194 (Ile), relative to SF-162. The "V1/V2 region" occurs at 
approximately amino acid positions 120 (Cys) to residue 189 (Cys), relative to SF-162. 
Extruding from the second pair of β-strands (β-20 and β-21) is a "small-loop" 
structure, also referred to herein as "the bridging sheet small loop." The locations of 
both the small loop and bridging sheet small loop can be determined relative to HXB-2 
following the teachings herein and in WO/39303. Also shown by arrows in Figures 
2A-2C are approximate sites for deletion of sequence from the beta sheet region. The "*" 
denotes N-glycosylation sites that can be mutated following the teachings of the 
present specification. 
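
As a non-limiting sketch of how a position given relative to a reference sequence can be located in another, aligned sequence, the function below walks a pairwise alignment column by column and converts a 1-based reference position into the corresponding 1-based position in the aligned sequence. The function name, the "-" gap character, and the toy alignment are assumptions for illustration; the alignment itself would come from a program such as those described herein.

```python
from typing import Optional


def map_reference_position(aligned_ref: str, aligned_query: str,
                           ref_pos: int, gap: str = "-") -> Optional[int]:
    """Map a 1-based reference position to a 1-based query position.

    Both inputs are rows of the same pairwise alignment (equal length,
    gaps written as "-"). Returns None when the reference residue is
    aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("alignment rows must have equal length")
    ref_count = query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != gap:
            ref_count += 1
        if query_char != gap:
            query_count += 1
        # Stop at the column holding the requested reference residue.
        if ref_char != gap and ref_count == ref_pos:
            return query_count if query_char != gap else None
    raise ValueError("ref_pos lies outside the reference sequence")


if __name__ == "__main__":
    ref_row = "ATG-CCGTA"    # toy reference row, not 8_5_TV1_C.ZA
    query_row = "ATGACC-TA"  # toy query row with one insertion and one deletion
    print([map_reference_position(ref_row, query_row, p) for p in (3, 4, 6, 7)])
```

Mapping both the start and the end position of a region in this way gives the approximate boundaries of the corresponding region in the aligned sequence.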



2.1.1. Wild-Type HIV Sequences 

Isolated nucleotide sequences for various novel subtype C isolates are 
shown in Table A1 below. Sequences were obtained and analyzed (e.g., phylogenetic 
tree analysis) as described in Engelbrecht et al. (2001) AIDS Res. Hum. Retroviruses 
17(16):1533-1547. (See, also, GenBank). Sequences of accessory proteins and 
analysis of these sequences are described in Scriba et al. (2001) AIDS Res. Hum. 
Retroviruses 17(8):775-781. 

Table A1: Wild-Type Sequences 



Name              SEQ ID NO  Figure Number   Description 

Env TV001c8.2     61         58 (2 sheets)   complete Env sequence of clone TV001c8.2 of isolate C-98TV001 
Env TV001c8.5     62         59 (2 sheets)   complete Env sequence of clone TV001c8.5 of isolate C-98TV001 
Env TV001c12.1    63         60 (2 sheets)   complete Env sequence of clone TV001c12.1 of isolate C-98TV002 
Env TV003cE260    64         61 (2 sheets)   complete Env sequence of clone TV003cE260 of isolate C-98TV003 
Env TV004cC300    65         62 (2 sheets)   complete Env sequence of clone TV004cC300 of isolate C-98TV004 
Env TV006c9.1     66         63 (2 sheets)   complete Env sequence of clone TV006c9.1 of isolate C-98TV006 
Env TV006c9.2     67         64 (2 sheets)   complete Env sequence of clone TV006c9.2 of isolate C-98TV006 
Env TV006cE9      68         65 (2 sheets)   complete Env sequence of clone TV006cE9 of isolate C-98TV006 
Env TV007cB104    69         66 (2 sheets)   complete Env sequence of clone TV007cB104 of isolate C-98TV007 
Env TV007cB105    70         67 (2 sheets)   complete Env sequence of clone TV007cB105 of isolate C-98TV007 
Env TV008c4.3     71         68 (2 sheets)   complete Env sequence of clone TV008c4.3 of isolate C-98TV008 
Env TV008c4.4     72         69 (2 sheets)   complete Env sequence of clone TV008c4.4 of isolate C-98TV008 
Env TV010cD7      73         70 (2 sheets)   complete Env sequence of clone TV010cD7 of isolate C-98TV010 
Env TV012c2.1     74         71 (2 sheets)   complete Env sequence of clone TV012c2.1 of isolate C-98TV012 
Env TV012c2.2     75         72 (2 sheets)   complete Env sequence of clone TV012c2.2 of isolate C-98TV012 
Env TV013cB20     76         73 (2 sheets)   complete Env sequence of clone TV013cB20 of isolate C-98TV013 
Env TV013cH17     77         74 (2 sheets)   complete Env sequence of clone TV013cH17 of isolate C-98TV013 
Env TV014c6.3     78         75 (2 sheets)   complete Env sequence of clone TV014c6.3 of isolate C-98TV014 
Env TV014c6.4     79         76 (2 sheets)   complete Env sequence of clone TV014c6.4 of isolate C-98TV014 
Env TV018cF1027   80         77 (2 sheets)   complete Env sequence of clone TV018cF1027 of isolate C-98TV018 
Env TV019c5       81         78 (2 sheets)   complete Env sequence of clone TV019c5 of isolate C-98TV019 
Gag TV001G8       82         79              complete Gag sequence of clone TV001G8 of isolate C-98TV001 
Gag TV001G11      83         80              complete Gag sequence of clone TV001G11 of isolate C-98TV001 
Gag TV002G8       84         81              complete Gag sequence of clone TV002G8 of isolate C-98TV002 
Gag TV003G15      85         82              complete Gag sequence of clone TV003G15 of isolate C-98TV003 
Gag TV004G17      86         83              complete Gag sequence of clone TV004G17 of isolate C-98TV004 
Gag TV004G24      87         84              complete Gag sequence of clone TV004G24 of isolate C-98TV004 
Gag TV006G11      88         85              complete Gag sequence of clone TV006G11 of isolate C-98TV006 


Gog TV006G97 


89 


86 


complete Gag sequence of clone 
TV006G97 of isolate C-98TV006 


Gag TV007G59 


90 


87 


complete Gag sequence of clone 
TV007G59 of isolate C-98TV009 


Gag TV008G65 


91 


88 


complete Gag sequence of clone 
TV008G65 of isolate C-98TV008 


Gag TV008666 


92 


89 


conq)lete Gag sequence of clone 
TV008G66 of isolate C-98TV008 
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SEQ 

roNO 


Figure 
Number 


DescriDtion 


Gag TV010G74 


93 


90 


comolete Ga£ seouence of clone 
TV010G74 of isolate C-98TV010 


Gag TV012G34 


94 


91 


con^lete Gag sequence of clone 
TV012G34 of isolate C-98TV012 


Gas TV012G40 


95 


92 


conrnlete Gclq seouence of clnnp 

TV012G40 of isolate C-98TV012 


Gag TV013G2 


96 


93 


comDlete Ga9 seouence of clone 
TV013G2 of isolate C-98TV013 


GagTV013G15 


97 


94 


complete Gag sequence of clone 
TV013G15 of isolate C-98TV013 


Gag TV014G73 


98 


95 


complete Gag sequence of clone 
TV014G73 of isolate C-98TV014 


GajeTV018G60 


99 


96 


comnletp G/iq seoiienre of rinne 

TV018G60 of isolate C-98TV018 


G£7P TV019G20 


100 


97 


ponrnlpff* Crno cpnnpnpp plr»nf* 

TV019G20 of isolate C-98TV019 


Gag TV019G25 


101 


98 


complete Gag sequence of clone 
TV019G25 of isolate C-98TV019 


8_2_TV1 LTR 


181 


102 

(2 sheets) 


sequence from the 3' region of the 

clone designated 8_2_TV1 


2^1/4_TV12_C_ZA 


182 


103 

(5 sheets) 


sequence of 2_1/4_TV12_Q_ZA 



2.2.0 Synthetic Expression Cassettes 

One aspect of the present invention is the generation of HIV-1 coding
sequences, and related sequences, for example having improved expression relative to
the corresponding wild-type sequences.

2.2.1 Modification of HIV-1 Nucleic Acid Coding Sequences

First, the HIV-1 codon usage pattern was modified so that the resulting nucleic
acid coding sequence was comparable to codon usage found in highly expressed
human genes. The HIV codon usage reflects a high content of the nucleotides A or T
of the codon-triplet. The effect of the HIV-1 codon usage is a high AT content in the
DNA sequence that results in a decreased translation ability and instability of the
mRNA. In comparison, highly expressed human codons prefer the nucleotides G or C.
The HIV coding sequences were modified to be comparable to codon usage found in
highly expressed human genes.
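
As an illustration of the codon replacement described above, the following Python sketch re-encodes a short coding sequence using one G/C-ending codon per amino acid and compares AT content before and after. The preference table, the function names, and the toy input sequence are assumptions made for this example; they are not the codon table or the sequences actually used for the constructs of the present specification.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed preference table: one G/C-ending codon per amino acid ("*" = stop).
PREFERRED = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "CGC",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
    "*": "TGA",
}


def codons(dna):
    """Split an in-frame DNA string into codons."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def at_content(dna):
    """Fraction of A and T nucleotides in the sequence."""
    seq = dna.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def humanize(dna):
    """Replace every codon with the assumed preferred codon for its amino acid."""
    return "".join(PREFERRED[CODON_TABLE[c]] for c in codons(dna))


if __name__ == "__main__":
    wild_type = "ATGGGTGCAAGAGCATCAGTATTA"  # toy AT-rich sequence (assumption)
    optimized = humanize(wild_type)
    print(optimized, translate(optimized))
    print(round(at_content(wild_type), 2), round(at_content(optimized), 2))
```

The encoded polypeptide is unchanged by construction, because each codon is swapped only for a synonymous codon; only the nucleotide composition shifts toward G and C.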

Second, there are inhibitory (or instability) elements (INS) located within the
coding sequences of, for example, the Gag coding sequences. The RRE is a secondary
RNA structure that interacts with the HIV encoded Rev-protein to overcome the
expression down-regulating effects of the INS. To overcome the post-transcriptional
activating mechanisms of RRE and Rev, the instability elements can be inactivated by
introducing multiple point mutations that do not alter the reading frame of the encoded
proteins.
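
The constraint on such point mutations can be checked mechanically. The Python sketch below lists the positions at which a mutated sequence differs from the original, rejects any change in length (which would shift the reading frame), and tests the stricter condition that the encoded amino acids are unchanged. The function names and the toy sequences are hypothetical and are given only for illustration.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def point_mutations(original, mutated):
    """Return the 0-based positions at which the two sequences differ."""
    if len(original) != len(mutated):
        raise ValueError("insertions or deletions would shift the reading frame")
    pairs = zip(original.upper(), mutated.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def is_silent(original, mutated):
    """True if the substitutions leave the encoded amino acids unchanged."""
    point_mutations(original, mutated)  # raises if the length changed
    return translate(original) == translate(mutated)


if __name__ == "__main__":
    original = "AAATTAGGA"  # toy AT-rich stretch encoding K-L-G (assumption)
    mutated = "AAGCTGGGC"   # same amino acids, four substitutions
    print(point_mutations(original, mutated), is_silent(original, mutated))
```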

Third, for some genes the coding sequence has been altered such that the
polynucleotide coding sequence encodes a gene product that is inactive or
non-functional (e.g., inactivated polymerase, protease, tat, rev, nef, vif, vpr, and/or vpu
gene products). Example 1 describes some exemplary mutations. Example 8 presents
information concerning functional analysis of mutated Tat, Rev and Nef antigens.

The synthetic coding sequences are assembled by methods known in the art, for
example by companies such as the Midland Certified Reagent Company (Midland,
Texas).

Modification of the Gag polypeptide coding sequences results in improved
expression relative to the wild-type coding sequences in a number of mammalian cell
lines (as well as other types of cell lines, including, but not limited to, insect cells).
Some exemplary polynucleotide sequences encoding Gag-containing
polypeptides are GagComplPolmut_C, GagComplPolmutAtt_C,
GagComplPolmutIna_C, GagComplPolmutInaTatRevNef_C, GagPolmut_C,
GagPolmutAtt_C, GagPolmutIna_C, GagProtInaRTmut_C,
GagProtInaRTmutTatRevNef_C, GagRTmut_C, GagRTmutTatRevNef_C,
GagTatRevNef_C, and gp120mod.TV1.del118-210.

Similarly, the present invention also includes synthetic Env-encoding
polynucleotides and modified Env proteins, for example, gp120mod.TV1.del118-210,
gp120mod.TV1.delV1V2, gp120mod.TV1.delV2, gp140mod.TV1.del118-210,
gp140mod.TV1.delV1V2, gp140mod.TV1.delV2, gp140mod.TV1.mut7,
gp140mod.TV1.tpa2, gp140TMmod.TV1, gp160mod.TV1.del118-210,
gp160mod.TV1.delV1V2, gp160mod.TV1.delV2, gp160mod.TV1.dV1,
gp160mod.TV1.dV1-gagmod.BW965, gp160mod.TV1.dV1V2-gagmod.BW965,
gp160mod.TV1.dV2-gagmod.BW965, gp160mod.TV1.tpa2, and
gp160mod.TV1-gagmod.BW965.

The codon usage pattern for Env was modified as described above for Gag so
that the resulting nucleic acid coding sequence was comparable to codon usage found
in highly expressed human genes. Experiments performed in support of the present
invention show that the synthetic Env sequences were capable of a higher level of
protein production relative to the native Env sequences.

Modification of the Env polypeptide coding sequences results in improved
expression relative to the wild-type coding sequences in a number of mammalian cell
lines (as well as other types of cell lines, including, but not limited to, insect cells).
Similar Env polypeptide coding sequences can be obtained, modified and tested for
improved expression from a variety of isolates, including those described above for
Gag.

Further modifications of Env include, but are not limited to, generating
polynucleotides that encode Env polypeptides having mutations and/or deletions
therein. For instance, the hypervariable regions, V1 and/or V2, can be deleted as
described herein. Additionally, other modifications, for example to the bridging sheet
region and/or to N-glycosylation sites within Env, can also be performed following the
teachings of the present specification (see Figure 2A-C, as well as WO 00/39303,
WO 00/39302, WO 00/39304, WO 02/04493). Various combinations of these
modifications can be employed to generate synthetic expression cassettes as described
herein.
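
Candidate N-glycosylation sites can be located in a polypeptide sequence by searching for the consensus sequon N-X-S/T, where X is any residue other than proline. The Python sketch below finds such sequons and removes them by substituting the asparagine. The choice of glutamine as the replacement residue, the function names, and the toy peptide are assumptions for illustration only; they are not the specific mutations of the present specification.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of the Asn residue of each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def remove_sequons(protein, positions, replacement="Q"):
    """Substitute the Asn at each given 1-based position (assumed N -> Q)."""
    residues = list(protein.upper())
    for pos in positions:
        if residues[pos - 1] != "N":
            raise ValueError(f"position {pos} is not an asparagine")
        residues[pos - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy = "MNGTKNPSANIS"  # toy peptide, not an Env sequence (assumption)
    sites = find_sequons(toy)
    print(sites)
    print(remove_sequons(toy, sites))
```

Positions found this way are relative to the input sequence; mapping them onto a reference numbering such as HXB-2 requires an alignment step that is not shown here.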

The present invention also includes expression cassettes which include
synthetic Pol sequences. As noted above, "Pol" includes, but is not limited to, the
protein-encoding regions comprising polymerase, protease, reverse transcriptase
and/or integrase-containing sequences (Wan et al. (1996) Biochem. J. 316:569-573;
Kohl et al. (1988) PNAS USA 85:4686-4690; Krausslich et al. (1988) J. Virol.
62:4393-4397; Coffin, "Retroviridae and their Replication" in Virology, pp. 1437-1500
(Raven, New York, 1990); Patel et al. (1995) Biochemistry 34:5351-5363). Thus, the
synthetic expression cassettes exemplified herein include one or more of these regions
and one or more changes to the resulting amino acid sequences. Some exemplary
polynucleotide sequences encoding Pol-derived polypeptides are presented in Table C.

The codon usage pattern for Pol was modified as described above for Gag and 
Env so that the resulting nucleic acid coding sequence was comparable to codon usage 
found in highly expressed human genes. 

Constructs may be modified in various ways. For example, the expression
constructs may include a sequence that encodes the first 6 amino acids of the integrase
polypeptide. This 6 amino acid region is believed to provide a cleavage recognition
site recognized by HIV protease (see, e.g., McCornack et al. (1997) FEBS Letts
414:84-88). Constructs may include a multiple cloning site (MCS) for insertion of
one or more transgenes, typically at the 3' end of the construct. In addition, a cassette
encoding a catalytic center epitope derived from the catalytic center in RT is typically
included 3' of the sequence encoding 6 amino acids of integrase. This cassette encodes
Ile178 through Serine 191 of RT and may be added to keep this well conserved region
as a possible CTL epitope. Further, the constructs contain insertion mutations to
preserve the reading frame (see, e.g., Park et al. (1991) J. Virol. 65:5111).

In certain embodiments, the catalytic center and/or primer grip region of RT
are modified. The catalytic center and primer grip regions of RT are described, for
example, in Patel et al. (1995) Biochem. 34:5351 and Palaniappan et al. (1997) J. Biol.
Chem. 272(17):11157. For example, wild-type sequence encoding the amino acids
YMDD at positions 183-185 of p66 RT, numbered relative to AF110975, may be
replaced with sequence encoding the amino acids "AF". Further, the primer grip
region (amino acids WMGY, residues 229-232 of p66 RT, numbered relative to
AF110975) may be replaced with sequence encoding the amino acids "PL".

For the Pol sequence, the changes in codon usage are typically restricted to the
regions up to the -1 frameshift and starting again at the end of the Gag reading frame;
however, regions within the frameshift translation region can be modified as well.
Finally, inhibitory (or instability) elements (INS) located within the coding sequences
of the protease polypeptide coding sequence can be altered as well.

Experiments can be performed in support of the present invention to show that
the synthetic Pol sequences are capable of a higher level of protein production relative
to the native Pol sequences. Modification of the Pol polypeptide coding sequences
results in improved expression relative to the wild-type coding sequences in a number
of mammalian cell lines (as well as other types of cell lines, including, but not limited
to, insect cells). Similar Pol polypeptide coding sequences can be obtained, modified
and tested for improved expression from a variety of isolates, including those described
above for Gag and Env.

The present invention also includes expression cassettes which include
synthetic sequences derived from HIV genes other than Gag, Env and Pol, including but not
limited to, regions within Gag, Env, Pol, as well as, GagComplPolmut_C,
GagComplPolmutAtt_C, GagComplPolmutIna_C, GagComplPolmutInaTatRevNef_C,
GagPolmut_C, GagPolmutAtt_C, GagPolmutIna_C, GagProtInaRTmut_C,
GagProtInaRTmutTatRevNef_C, GagRTmut_C, GagRTmutTatRevNef_C,
GagTatRevNef_C, gp120mod.TV1.del118-210, gp120mod.TV1.delV1V2,
gp120mod.TV1.delV2, gp140mod.TV1.del118-210, gp140mod.TV1.delV1V2,
gp140mod.TV1.delV2, gp140mod.TV1.mut7, gp140mod.TV1.tpa2,
gp140TMmod.TV1, gp160mod.TV1.del118-210, gp160mod.TV1.delV1V2,
gp160mod.TV1.delV2, gp160mod.TV1.dV1, gp160mod.TV1.dV1-gagmod.BW965,
gp160mod.TV1.dV1V2-gagmod.BW965, gp160mod.TV1.dV2-gagmod.BW965,
gp160mod.TV1.tpa2, gp160mod.TV1-gagmod.BW965, int.opt.mut_C, int.opt_C,
nef.D106G.-myr19.opt_C, p15RnaseH.opt_C, p2Pol.opt.YMWM_C,
p2Polopt.YM_C, p2Polopt_C, p2PolTatRevNef opt C,
p2PolTatRevNef.opt.native_C, p2PolTatRevNef.opt_C, protInaRT.YM.opt_C,
protInaRT.YMWM.opt_C, ProtRT.TatRevNef.opt_C, rev.exon1_2.M5-10.opt_C,
tat.exon1_2.opt.C22-37_C, tat.exon1_2.opt.C37_C, TatRevNef.opt.native_ZA,
TatRevNef.opt_ZA, TatRevNefGag C, TatRevNefgagCpolIna C,
TatRevNefGagProtInaRTmut C, and TatRevNefProtRT opt C. Sequences obtained
from other strains can be manipulated in similar fashion following the teachings of the
present specification. As noted above, the codon usage pattern is modified as
described above for Gag, Env and Pol so that the resulting nucleic acid coding
sequence is comparable to codon usage found in highly expressed human genes.
Typically these synthetic sequences are capable of a higher level of protein production
relative to the native sequences, and modification of the wild-type polypeptide
coding sequences results in improved expression relative to the wild-type coding
sequences in a number of mammalian cell lines (as well as other types of cell lines,
including, but not limited to, insect cells). Furthermore, the nucleic acid sequence can
also be modified to introduce mutations into one or more regions of the gene, for
instance to alter the function of the gene product (e.g., render the gene product
non-functional) and/or to eliminate site modifications (e.g., the myristoylation site in Nef).

Synthetic expression cassettes, derived from HIV Type C coding sequences,
exemplified herein include, but are not limited to, those comprising one or more of the
following synthetic polynucleotides: GagComplPolmut_C, GagComplPolmutAtt_C,
GagComplPolmutIna_C, GagComplPolmutInaTatRevNef_C, GagPolmut_C,
GagPolmutAtt_C, GagPolmutIna_C, GagProtInaRTmut_C,
GagProtInaRTmutTatRevNef_C, GagRTmut_C, GagRTmutTatRevNef_C,
GagTatRevNef_C, gp120mod.TV1.del118-210, gp120mod.TV1.delV1V2,
gp120mod.TV1.delV2, gp140mod.TV1.del118-210, gp140mod.TV1.delV1V2,
gp140mod.TV1.delV2, gp140mod.TV1.mut7, gp140mod.TV1.tpa2,
gp140TMmod.TV1, gp160mod.TV1.del118-210, gp160mod.TV1.delV1V2,
gp160mod.TV1.delV2, gp160mod.TV1.dV1, gp160mod.TV1.dV1-gagmod.BW965,
gp160mod.TV1.dV1V2-gagmod.BW965, gp160mod.TV1.dV2-gagmod.BW965,
gp160mod.TV1.tpa2, gp160mod.TV1-gagmod.BW965, int.opt.mut_C, int.opt_C,
nef.D106G.-myr19.opt_C, p15RnaseH.opt_C, p2Pol.opt.YMWM_C,
p2Polopt.YM_C, p2Polopt_C, p2PolTatRevNef opt C,
p2PolTatRevNef.opt.native_C, p2PolTatRevNef.opt_C, protInaRT.YM.opt_C,
protInaRT.YMWM.opt_C, ProtRT.TatRevNef.opt_C, rev.exon1_2.M5-10.opt_C,
tat.exon1_2.opt.C22-37_C, tat.exon1_2.opt.C37_C, TatRevNef.opt.native_ZA,
TatRevNef.opt_ZA, TatRevNefGag C, TatRevNefgagCpolIna C,
TatRevNefGagProtInaRTmut C, and TatRevNefProtRT opt C.

Gag-complete refers to in-frame polyproteins comprising, e.g., Gag and pol,
wherein the p6 portion of Gag is present.

Additional sequences that may be employed in some aspects of the present
invention have been described in WO 00/39302, WO 00/39303, WO 00/39304, and
WO 02/04493.

2.2.2 Further Modification of Sequences Including HIV Nucleic 
Acid Coding Sequences 

The HIV polypeptide-encoding expression cassettes described herein may also
contain one or more further sequences encoding, for example, one or more transgenes.
Further sequences (e.g., transgenes) useful in the practice of the present invention
include, but are not limited to, those encoding further viral
epitopes/antigens {including but not limited to, HCV antigens (e.g., E1, E2;
Houghton, M., et al., U.S. Patent No. 5,714,596, issued February 3, 1998; Houghton,
M., et al., U.S. Patent No. 5,712,088, issued January 27, 1998; Houghton, M., et al.,
U.S. Patent No. 5,683,864, issued November 4, 1997; Weiner, A.J., et al., U.S. Patent
No. 5,728,520, issued March 17, 1998; Weiner, A.J., et al., U.S. Patent No.
5,766,845, issued June 16, 1998; Weiner, A.J., et al., U.S. Patent No. 5,670,152,
issued September 23, 1997), HIV antigens (e.g., derived from one or more HIV
isolates)}; and sequences encoding tumor antigens/epitopes. Further sequences may also
be derived from non-viral sources, for instance, sequences encoding cytokines such as
interleukin-2 (IL-2), stem cell factor (SCF), interleukin 3 (IL-3), interleukin 6 (IL-6),
interleukin 12 (IL-12), G-CSF, granulocyte macrophage-colony stimulating factor
(GM-CSF), interleukin-1 alpha (IL-1α), interleukin-11 (IL-11), MIP-1α, tumor necrosis
factor (TNF), leukemia inhibitory factor (LIF), c-kit ligand, thrombopoietin (TPO) and
flt3 ligand, commercially available from several vendors such as, for example,
Genzyme (Framingham, MA), Genentech (South San Francisco, CA), Amgen
(Thousand Oaks, CA), R&D Systems and Immunex (Seattle, WA). Additional
sequences are described below. Also, variations on the orientation of the Gag and
other coding sequences, relative to each other, are described below.

HIV polypeptide coding sequences can be obtained from other HIV isolates,
see, e.g., Myers et al. Los Alamos Database, Los Alamos National Laboratory, Los
Alamos, New Mexico (1992); Myers et al., Human Retroviruses and Aids, 1997, Los
Alamos, New Mexico: Los Alamos National Laboratory. Synthetic expression
cassettes can be generated using such coding sequences as starting material by
following the teachings of the present specification.

Further, the synthetic expression cassettes of the present invention include
related polypeptide sequences having greater than 85%, preferably greater than 90%,
more preferably greater than 95%, and most preferably greater than 98% sequence
identity to the polypeptides encoded by the synthetic expression cassette sequences
disclosed herein.
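
Percent sequence identity between two polypeptides can be computed from a pairwise alignment. The Python sketch below assumes the two sequences have already been aligned (with "-" marking gaps) and counts identical columns; a column with a gap in only one sequence is counted as a mismatch. This definition of identity, the function names, and the toy sequences are assumptions for illustration; other conventions for handling gaps exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


# Thresholds named in the text, highest first.
THRESHOLDS = (98.0, 95.0, 90.0, 85.0)


def identity_tier(pct):
    """Return the highest threshold that pct strictly exceeds, or None."""
    for threshold in THRESHOLDS:
        if pct > threshold:
            return threshold
    return None


if __name__ == "__main__":
    reference = "MGARASVLSG"  # toy sequences (assumption)
    variant = "MGARASILSG"
    pct = percent_identity(reference, variant)
    print(pct, identity_tier(pct))
```

Because the text specifies "greater than" each threshold, the comparison is strict: a value of exactly 90% falls in the "greater than 85%" tier.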

Exemplary expression cassettes and modifications are set forth in Example 1. 



2.2.3 Expression of Synthetic Sequences Encoding HIV-1
Polypeptides and Related Polypeptides

Synthetic HIV-encoding sequences (expression cassettes) of the present
invention can be cloned into a number of different expression vectors to evaluate levels
of expression and, in the case of Gag-containing constructs, production of VLPs. The
synthetic DNA fragments for HIV polypeptides can be cloned into eucaryotic
expression vectors, including a transient expression vector, CMV-promoter-based
mammalian vectors, and a shuttle vector for use in baculovirus expression systems.
Corresponding wild-type sequences can also be cloned into the same vectors.

These vectors can then be transfected into several different cell types,
including a variety of mammalian cell lines (293, RD, COS-7, and CHO, cell lines
available, for example, from the A.T.C.C.). The cell lines are then cultured under
appropriate conditions and the levels of any appropriate polypeptide product can be
evaluated in supernatants (see Table A). For example, p24 can be used to evaluate
Gag expression; gp160, gp140 or gp120 can be used to evaluate Env expression;
p6pol can be used to evaluate Pol expression; prot can be used to evaluate protease;
p15 for RNAseH; p31 for Integrase; and other appropriate polypeptides for Vif, Vpr,
Tat, Rev, Vpu and Nef. Further, modified polypeptides can also be used, for example,
Other Env polypeptides include, but are not limited to, for example, native gp160,
oligomeric gp140, monomeric gp120 as well as modified and/or synthetic sequences of
these polypeptides. The results of these assays demonstrate that expression of
synthetic HIV polypeptide-encoding sequences is significantly higher than that of
corresponding wild-type sequences.

Further, Western Blot analysis can be used to show that cells containing the
synthetic expression cassette produce the expected protein at higher per-cell
concentrations than cells containing the native expression cassette. The HIV proteins
can be seen in both cell lysates and supernatants. The levels of production are
significantly higher in cell supernatants for cells transfected with the synthetic
expression cassettes of the present invention.

Fractionation of the supernatants from mammalian cells transfected with the
synthetic expression cassette can be used to show that the cassettes provide superior
production of HIV proteins and, in the case of Gag, VLPs, relative to the wild-type
sequences.

Efficient expression of these HIV-containing polypeptides in mammalian cell
lines provides the following benefits: the polypeptides are free of baculovirus
contaminants; production by established methods approved by the FDA; increased
purity; greater yields (relative to native coding sequences); and a novel method of
producing the Sub HIV-containing polypeptides in CHO cells which is not feasible in
the absence of the increased expression obtained using the constructs of the present
invention. Exemplary mammalian cell lines include, but are not limited to, BHK,
VERO, HT1080, 293, 293T, RD, COS-7, CHO, Jurkat, HUT, SUPT, C8166,
MOLT4/clone8, MT-2, MT-4, H9, PM1, CEM, and CEMX174 (such cell lines are
available, for example, from the A.T.C.C.).

A synthetic Gag expression cassette of the present invention will also exhibit
high levels of expression and VLP production when transfected into insect cells.
Synthetic expression cassettes described herein also demonstrate high levels of
expression in insect cells. Further, in addition to a higher total protein yield, the final
product from the synthetic polypeptides consistently contains lower amounts of
contaminating baculovirus proteins than the final product from the native sequences.

Further, synthetic expression cassettes of the present invention can also be
introduced into yeast vectors which, in turn, can be transformed into and efficiently
expressed by yeast cells (Saccharomyces cerevisiae; using vectors as described in
Rosenberg, S. and Tekamp-Olson, P., U.S. Patent No. RE35,749, issued March 17,
1998).

In addition to the mammalian and insect vectors, the synthetic expression
cassettes of the present invention can be incorporated into a variety of expression
vectors using selected expression control elements. Appropriate vectors and control
elements for any given cell can be selected by one having ordinary skill in the art in view
of the teachings of the present specification and information known in the art about
expression vectors.

For example, a synthetic expression cassette can be inserted into a vector
which includes control elements operably linked to the desired coding sequence, which
allow for the expression of the gene in a selected cell-type. For example, typical
promoters for mammalian cell expression include the SV40 early promoter, a CMV
promoter such as the CMV immediate early promoter (a CMV promoter can include
intron A), RSV, HIV-LTR, the mouse mammary tumor virus LTR promoter
(MMLV-LTR), the adenovirus major late promoter (Ad MLP), and the herpes simplex virus
promoter, among others. Other nonviral promoters, such as a promoter derived from
the murine metallothionein gene, will also find use for mammalian expression.
Typically, transcription termination and polyadenylation sequences will also be present,
located 3' to the translation stop codon. Preferably, a sequence for optimization of
initiation of translation, located 5' to the coding sequence, is also present. Examples
of transcription terminator/polyadenylation signals include those derived from SV40,
as described in Sambrook, et al., supra, as well as a bovine growth hormone
terminator sequence. Introns, containing splice donor and acceptor sites, may also be
designed into the constructs for use with the present invention (Chapman et al., Nuc.
Acids Res. (1991) 19:3979-3986).

Enhancer elements may also be used herein to increase expression levels of the
mammalian constructs. Examples include the SV40 early gene enhancer, as described
in Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter derived from the
long terminal repeat (LTR) of the Rous Sarcoma Virus, as described in Gorman et al.,
Proc. Natl. Acad. Sci. USA (1982b) 79:6777 and elements derived from human CMV,
as described in Boshart et al., Cell (1985) 41:521, such as elements included in the
CMV intron A sequence (Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986).

The desired synthetic polypeptide encoding sequences can be cloned into any
number of commercially available vectors to generate expression of the polypeptide in
an appropriate host system. These systems include, but are not limited to, the
following: baculovirus expression {Reilly, P.R., et al., Baculovirus Expression
Vectors: A Laboratory Manual (1992); Beames, et al., Biotechniques 11:378
(1991); Pharmingen; Clontech, Palo Alto, CA}, vaccinia expression {Earl, P. L., et
al., "Expression of proteins in mammalian cells using vaccinia" In Current Protocols
in Molecular Biology (F. M. Ausubel, et al. Eds.), Greene Publishing Associates &
Wiley Interscience, New York (1991); Moss, B., et al., U.S. Patent Number
5,135,855, issued 4 August 1992}, expression in bacteria {Ausubel, F.M., et al.,
Current Protocols in Molecular Biology, John Wiley and Sons, Inc., Media
PA; Clontech}, expression in yeast {Rosenberg, S. and Tekamp-Olson, P., U.S. Patent
No. RE35,749, issued March 17, 1998; Shuster, J.R., U.S. Patent No. 5,629,203,
issued May 13, 1997; Gellissen, G., et al., Antonie Van Leeuwenhoek, 62(1-2):79-93
(1992); Romanos, M.A., et al., Yeast 8(6):423-488 (1992); Goeddel, D.V., Methods
in Enzymology 185 (1990); Guthrie, C., and G.R. Fink, Methods in Enzymology 194
(1991)}, expression in mammalian cells {Clontech; Gibco-BRL, Grand Island, NY;
e.g., Chinese hamster ovary (CHO) cell lines (Haynes, J., et al., Nuc. Acid. Res.
11:687-706 (1983); Lau, Y.F., et al., Mol. Cell. Biol. 4:1469-1475 (1984);
Kaufman, R. J., "Selection and coamplification of heterologous genes in mammalian
cells," in Methods in Enzymology, vol. 185, pp. 537-566. Academic Press, Inc., San
Diego CA (1991))}, and expression in plant cells {plant cloning vectors, Clontech
Laboratories, Inc., Palo Alto, CA, and Pharmacia LKB Biotechnology, Inc.,
Piscataway, NJ; Hood, E., et al., J. Bacteriol. 168:1291-1301 (1986); Nagel, R., et
al., FEMS Microbiol. Lett. 67:325 (1990); An, et al., "Binary Vectors", and others in
Plant Molecular Biology Manual A3:1-19 (1988); Miki, B.L.A., et al., pp. 249-265,
and others in Plant DNA Infectious Agents (Hohn, T., et al., eds.) Springer-Verlag,
Wien, Austria, (1987); Plant Molecular Biology: Essential Techniques, P.G. Jones
and J.M. Sutton, New York, J. Wiley, 1997; Miglani, Gurbachan, Dictionary of Plant
Genetics and Molecular Biology, New York, Food Products Press, 1998; Henry, R.
J., Practical Applications of Plant Molecular Biology, New York, Chapman & Hall,
1997}.

Also included in the invention is an expression vector, containing coding
sequences and expression control elements which allow expression of the coding
regions in a suitable host. The control elements generally include a promoter,
translation initiation codon, and translation and transcription termination sequences,
and an insertion site for introducing the insert into the vector. Translational control
elements have been reviewed by M. Kozak (e.g., Kozak, M., Mamm. Genome
7(8):563-574, 1996; Kozak, M., Biochimie 76(9):815-821, 1994; Kozak, M., J. Cell
Biol. 108(2):229-241, 1989; Kozak, M., and Shatkin, A.J., Methods Enzymol.
60:360-375, 1979).
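
Before a coding insert is placed between such control elements, a few basic properties can be checked by script. The Python sketch below tests that an insert begins with an ATG initiation codon, is a whole number of codons long, ends with a stop codon, and contains no internal stop codon. The function name and the toy inserts are hypothetical; the sketch does not model promoters, polyadenylation signals, or the sequence context around the initiation codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_coding_insert(dna):
    """Return a list of problems found in a coding insert (empty if none)."""
    seq = dna.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Read complete codons only; any trailing partial codon is ignored here.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("no ATG initiation codon at the 5' end")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no stop codon at the 3' end")
    if any(c in STOP_CODONS for c in codons[1:-1]):
        problems.append("internal stop codon")
    return problems


if __name__ == "__main__":
    for insert in ("ATGGGCGCCTGA", "GGCGCCTGA", "ATGTGAGGCTAA"):  # toy inserts
        print(insert, check_coding_insert(insert))
```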

Expression in yeast systems has the advantage of commercial production.
Recombinant protein production by vaccinia and CHO cell lines has the advantage of
being mammalian expression systems. Further, vaccinia virus expression has several
advantages including the following: (i) its wide host range; (ii) faithful
post-transcriptional modification, processing, folding, transport, secretion, and assembly of
recombinant proteins; (iii) high level expression of relatively soluble recombinant
proteins; and (iv) a large capacity to accommodate foreign DNA.

The recombinantly expressed polypeptides from synthetic HIV
polypeptide-encoding expression cassettes are typically isolated from lysed cells or culture media.
Purification can be carried out by methods known in the art including salt
fractionation, ion exchange chromatography, gel filtration, size-exclusion
chromatography, size-fractionation, and affinity chromatography. Immunoaffinity
chromatography can be employed using antibodies generated based on, for example,
HIV antigens.

Advantages of expressing the proteins of the present invention using
mammalian cells include, but are not limited to, the following: well-established
protocols for scale-up production; the ability to produce VLPs; cell lines are suitable to
meet good manufacturing process (GMP) standards; culture conditions for mammalian
cells are known in the art.

Synthetic HIV-1 polynucleotides are described herein, see, for example, the
figures. Various forms of the different embodiments of the invention, described herein,
may be combined.

Exemplary expression assays are set forth in Example 2. Exemplary conditions
for Western Blot analysis are presented in Example 3.

2.3.0 Production of Virus-like Particles and Use of the
Constructs of the Present Invention to Create Packaging
Cell Lines

The group-specific antigens (Gag) of human immunodeficiency virus type-1
(HIV-1) self-assemble into noninfectious virus-like particles (VLP) that are released
from various eucaryotic cells by budding (reviewed by Freed, E.O., Virology 251:1-15,
1998). The Gag-containing synthetic expression cassettes of the present invention
provide for the production of HIV-Gag virus-like particles (VLPs) using a variety of
different cell types, including, but not limited to, mammalian cells.

Viral particles can be used as a matrix for the proper presentation of an antigen
entrapped or associated therewith to the immune system of the host.

2.3.1 VLP Production using the synthetic expression cassettes 
of the present invention 

The Gag-containing synthetic expression cassettes of the present invention may
provide superior production of both Gag proteins and VLPs, relative to native Gag
coding sequences. Further, electron microscopic evaluation of VLP production can be
used to show that free and budding immature virus particles of the expected size are
produced by cells containing the synthetic expression cassettes.

Using the synthetic expression cassettes of the present invention, rather than
native Gag coding sequences, for the production of virus-like particles provides several
advantages. First, VLPs can be produced in enhanced quantity, making isolation and
purification of the VLPs easier. Second, VLPs can be produced in a variety of cell
types using the synthetic expression cassettes; in particular, mammalian cell lines can
be used for VLP production, for example, CHO cells. Production using CHO cells
provides (i) VLP formation; (ii) correct myristoylation and budding; (iii) absence of
non-mammalian cell contaminants (e.g., insect viruses and/or cells); and (iv) ease of
purification. The synthetic expression cassettes of the present invention are also useful
for enhanced expression in cell-types other than mammalian cell lines. For example,
infection of insect cells with baculovirus vectors encoding the synthetic expression
cassettes results in higher levels of total Gag protein yield and higher levels of VLP
production (relative to wild-type coding sequences). Further, the final product from insect
cells infected with the baculovirus-Gag synthetic expression cassettes consistently
contains lower amounts of contaminating insect proteins than the final product when
wild-type coding sequences are used.

VLPs can spontaneously form when the particle-forming polypeptide of
interest is recombinantly expressed in an appropriate host cell. Thus, the VLPs
produced using the synthetic expression cassettes of the present invention are
conveniently prepared using recombinant techniques. As discussed below, the Gag
polypeptide encoding synthetic expression cassettes of the present invention can
include other polypeptide coding sequences of interest (for example, HIV protease,
HIV polymerase, Env, synthetic Env). Expression of such synthetic expression
cassettes yields VLPs comprising the Gag polypeptide, as well as the polypeptide of
interest.

Once coding sequences for the desired particle-forming polypeptides have been
isolated or synthesized, they can be cloned into any suitable vector or replicon for
expression. Numerous cloning vectors are known to those of skill in the art, and the
selection of an appropriate cloning vector is a matter of choice. See, generally,
Sambrook et al., supra. The vector is then used to transform an appropriate host cell.
Suitable recombinant expression systems include, but are not limited to, bacterial,
mammalian, baculovirus/insect, vaccinia, Semliki Forest virus (SFV), Alphaviruses
(such as, Sindbis, Venezuelan Equine Encephalitis (VEE)), yeast and
Xenopus expression systems, well known in the art. Particularly preferred expression
systems are mammalian cell lines, vaccinia, Sindbis, eucaryotic layered vector initiation
systems (e.g., US Patent No. 6,015,686, US Patent No. 5,814,482, US Patent No.
6,015,694, US Patent No. 5,789,245, EP 1029068A2, WO 9918226A2/A3, EP
00907746A2, WO 9738087A2), insect and yeast systems.

The synthetic DNA fragments for the expression cassettes of the present
invention, e.g., Pol, Gag, Env, Tat, Rev, Nef, Vif, Vpr, and/or Vpu, may be cloned
into the following eucaryotic expression vectors: pCMVKm2, for transient expression
assays and DNA immunization studies (the pCMVKm2 vector is derived from
pCMV6a (Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986) and comprises a
kanamycin selectable marker, a ColE1 origin of replication, a CMV promoter enhancer
and Intron A, followed by an insertion site for the synthetic sequences described below,
followed by a polyadenylation signal derived from bovine growth hormone; the
pCMVKm2 vector differs from the pCMV-link vector only in that a polylinker site is
inserted into pCMVKm2 to generate pCMV-link); pESN2dhfr and pCMVPLEdhfr, for
expression in Chinese Hamster Ovary (CHO) cells; and pAcC13, a shuttle vector for
use in the Baculovirus expression system (pAcC13 is derived from pAcC12, which is
described by Munemitsu S., et al., Mol Cell Biol 10(11):5977-5982, 1990).

Briefly, construction of pCMVPLEdhfr was as follows.
To construct a DHFR cassette, the EMCV IRES (internal ribosome entry site)
leader was PCR-amplified from pCite-4a+ (Novagen, Inc., Milwaukee, WI) and
inserted into pET-23d (Novagen, Inc., Milwaukee, WI) as an Xba-Nco fragment to
give pET-EMCV. The dhfr gene was PCR-amplified from pESN2dhfr to give a
product with a Gly-Gly-Gly-Ser spacer in place of the translation stop codon and
inserted as an Nco-BamH1 fragment to give pET-E-DHFR. Next, the attenuated neo
gene was PCR amplified from a pSV2Neo (Clontech, Palo Alto, CA) derivative and
inserted into the unique BamH1 site of pET-E-DHFR to give pET-E-DHFR/Neo(m2).
Finally the bovine growth hormone terminator from pCDNA3 (Invitrogen, Inc.,
Carlsbad, CA) was inserted downstream of the neo gene to give
pET-E-DHFR/Neo(m2)BGHt. The EMCV-dhfr/neo selectable marker cassette fragment was
prepared by cleavage of pET-E-DHFR/Neo(m2)BGHt.

In one vector construct the CMV enhancer/promoter plus Intron A was
transferred from pCMV6a (Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986) as
a HindIII-Sal1 fragment into pUC19 (New England Biolabs, Inc., Beverly, MA). The
vector backbone of pUC19 was deleted from the Nde1 to the Sap1 sites. The above
described DHFR cassette was added to the construct such that the EMCV IRES
followed the CMV promoter. The vector also contained an ampicillin resistance (ampr)
gene and an SV40 origin of replication.

A number of mammalian cell lines are known in the art and include immortalized
cell lines available from the American Type Culture Collection (A.T.C.C.), such
as, but not limited to, Chinese hamster ovary (CHO) cells, HeLa cells, baby hamster
kidney (BHK) cells, monkey kidney cells (COS), as well as others. Similarly, bacterial
hosts such as E. coli, Bacillus subtilis, and Streptococcus spp., will find use with the
present expression constructs. Yeast hosts useful in the present invention include, inter
alia, Saccharomyces cerevisiae, Candida albicans, Candida maltosa, Hansenula
polymorpha, Kluyveromyces fragilis, Kluyveromyces lactis, Pichia guilliermondii,
Pichia pastoris, Schizosaccharomyces pombe and Yarrowia lipolytica. Insect cells for
use with baculovirus expression vectors include, inter alia, Aedes aegypti, Autographa
californica, Bombyx mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni. See, e.g., Summers and Smith, Texas Agricultural Experiment Station
Bulletin No. 1555 (1987).

Viral vectors can be used for the production of particles in eucaryotic cells,
such as those derived from the pox family of viruses, including vaccinia virus and avian
poxvirus. Additionally, a vaccinia based infection/transfection system, as described in
Tomei et al., J. Virol. (1993) 67:4017-4026 and Selby et al., J. Gen. Virol. (1993)
74:1103-1113, will also find use with the present invention. In this system, cells are
first infected in vitro with a vaccinia virus recombinant that encodes the bacteriophage
T7 RNA polymerase. This polymerase displays exquisite specificity in that it only
transcribes templates bearing T7 promoters. Following infection, cells are transfected
with the DNA of interest, driven by a T7 promoter. The polymerase expressed in the
cytoplasm from the vaccinia virus recombinant transcribes the transfected DNA into
RNA which is then translated into protein by the host translational machinery.



Alternately, T7 can be added as a purified protein or enzyme as in the "Progenitor"
system (Studier and Moffatt, J. Mol. Biol. (1986) 189:113-130). The method
provides for high level, transient, cytoplasmic production of large quantities of RNA
and its translation product(s).

Depending on the expression system and host selected, the VLPs are produced
by growing host cells transformed by an expression vector under conditions whereby
the particle-forming polypeptide is expressed and VLPs can be formed. The selection
of the appropriate growth conditions is within the skill of the art. If the VLPs are
formed intracellularly, the cells are then disrupted, using chemical, physical or
mechanical means, which lyse the cells yet keep the VLPs substantially intact. Such
methods are known to those of skill in the art and are described in, e.g., Protein
Purification Applications: A Practical Approach (E.L.V. Harris and S. Angal, Eds.,
1990).

The particles are then isolated (or substantially purified) using methods that
preserve the integrity thereof, such as, by gradient centrifugation, e.g., cesium chloride
(CsCl), sucrose gradients, pelleting and the like (see, e.g., Kirnbauer et al. J. Virol.
(1993) 67:6929-6936), as well as standard purification techniques including, e.g., ion
exchange and gel filtration chromatography.

VLPs produced by cells containing the synthetic expression cassettes of the
present invention can be used to elicit an immune response when administered to a
subject. One advantage of the present invention is that VLPs can be produced by
mammalian cells carrying the synthetic expression cassettes at levels previously not
possible. As discussed above, the VLPs can comprise a variety of antigens in addition
to the Gag polypeptide (e.g., Gag-protease, Gag-polymerase, Env, synthetic Env,
etc.). Purified VLPs, produced using the synthetic expression cassettes of the present
invention, can be administered to a vertebrate subject, usually in the form of vaccine
compositions. Combination vaccines may also be used, where such vaccines contain,
for example, an adjuvant subunit protein (e.g., Env). Administration can take place
using the VLPs formulated alone or formulated with other antigens. Further, the
VLPs can be administered prior to, concurrent with, or subsequent to, delivery of the
synthetic expression cassettes for DNA immunization (see below) and/or delivery of
other vaccines. Also, the site of VLP administration may be the same as or different from
that of other vaccine compositions that are being administered. Gene delivery can be
accomplished by a number of methods including, but not limited to, immunization
with DNA, alphavirus vectors, pox virus vectors, and vaccinia virus vectors.

VLP immune-stimulating (or vaccine) compositions can include various
excipients, adjuvants, carriers, auxiliary substances, modulating agents, and the like.
The immune stimulating compositions will include an amount of the VLP/antigen
sufficient to mount an immunological response. An appropriate effective amount can
be determined by one of skill in the art. Such an amount will fall in a relatively broad
range that can be determined through routine trials and will generally be an amount on
the order of about 0.1 µg to about 1000 µg, more preferably about 1 µg to about 300
µg, of VLP/antigen.

A carrier is optionally present which is a molecule that does not itself induce
the production of antibodies harmful to the individual receiving the composition.
Suitable carriers are typically large, slowly metabolized macromolecules such as
proteins, polysaccharides, polylactic acids, polyglycollic acids, polymeric amino acids,
amino acid copolymers, lipid aggregates (such as oil droplets or liposomes), and
inactive virus particles. Examples of particulate carriers include those derived from
polymethyl methacrylate polymers, as well as microparticles derived from
poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al.,
Pharm. Res. (1993) 10:362-368; McGee JP, et al., J Microencapsul 14(2):197-210,
1997; O'Hagan DT, et al., Vaccine 11(2):149-54, 1993. Such carriers are well known
to those of ordinary skill in the art. Additionally, these carriers may function as
immunostimulating agents ("adjuvants"). Furthermore, the antigen may be conjugated
to a bacterial toxoid, such as toxoid from diphtheria, tetanus, cholera, etc., as well as
toxins derived from E. coli.

Adjuvants may also be used to enhance the effectiveness of the compositions.
Such adjuvants include, but are not limited to: (1) aluminum salts (alum), such as
aluminum hydroxide, aluminum phosphate, aluminum sulfate, etc.; (2) oil-in-water
emulsion formulations (with or without other specific immunostimulating agents such
as muramyl peptides (see below) or bacterial cell wall components), such as for
example (a) MF59 (International Publication No. WO 90/14837), containing 5%
Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts
of MTP-PE (see below), although not required) formulated into submicron particles
using a microfluidizer such as Model 110Y microfluidizer (Microfluidics, Newton,
MA), (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked
polymer L121, and thr-MDP (see below) either microfluidized into a submicron
emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™
adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene,
0.2% Tween 80, and one or more bacterial cell wall components from the group
consisting of monophosphorylipid A (MPL), trehalose dimycolate (TDM), and cell
wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such
as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles
generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete
Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such
as interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF),
tumor necrosis factor (TNF), etc.; (6) oligonucleotides or polymeric molecules
encoding immunostimulatory CpG motifs (Davis, H.L., et al., J. Immunology 160:870-876,
1998; Sato, Y. et al., Science 273:352-354, 1996) or complexes of
antigens/oligonucleotides {Polymeric molecules include double and single stranded
RNA and DNA, and backbone modifications thereof, for example, methylphosphonate
linkages. Further, such polymeric molecules include alternative polymer backbone
structures such as, but not limited to, polyvinyl backbones (Pitha, Biochem Biophys
Acta, 204:39, 1970a; Pitha, Biopolymers, 9:965, 1970b), and morpholino backbones
(Summerton, J., et al., U.S. Patent No. 5,142,047, issued 08/25/92; Summerton, J.,
et al., U.S. Patent No. 5,185,444 issued 02/09/93). A variety of other charged and
uncharged polynucleotide analogs have been reported. Numerous backbone
modifications are known in the art, including, but not limited to, uncharged linkages
(e.g., methyl phosphonates, phosphotriesters, phosphoamidates, and carbamates) and
charged linkages (e.g., phosphorothioates and phosphorodithioates).}; (7) detoxified
mutants of a bacterial ADP-ribosylating toxin such as a
cholera toxin (CT), a pertussis toxin (PT), or an E. coli heat-labile toxin (LT),
particularly LT-K63 (where lysine is substituted for the wild-type amino acid at
position 63), LT-R72 (where arginine is substituted for the wild-type amino acid at
position 72), CT-S109 (where serine is substituted for the wild-type amino acid at
position 109), and PT-K9/G129 (where lysine is substituted for the wild-type amino
acid at position 9 and glycine substituted at position 129) (see, e.g., International
Publication Nos. WO93/13202 and WO92/19265); and (8) other substances that act as
immunostimulating agents to enhance the effectiveness of the VLP immune-stimulating
(or vaccine) composition. Alum, CpG oligonucleotides, and MF59 are preferred.

Muramyl peptides include, but are not limited to,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(MTP-PE), etc.

Dosage treatment with the VLP composition may be a single dose schedule or
a multiple dose schedule. A multiple dose schedule is one in which a primary course of
vaccination may be with 1-10 separate doses, followed by other doses given at
subsequent time intervals, chosen to maintain and/or reinforce the immune response,
for example at 1-4 months for a second dose, and if needed, a subsequent dose(s) after
several months. The dosage regimen will also, at least in part, be determined by the
need of the subject and be dependent on the judgment of the practitioner.

If prevention of disease is desired, the antigen carrying VLPs are generally 
administered prior to primary infection with the pathogen of interest. If treatment is 
desired, e.g., the reduction of symptoms or recurrences, the VLP compositions are 
generally administered subsequent to primary infection. 

2.3.2 Using the Synthetic Expression Cassettes of the Present
Invention to Create Packaging Cell Lines

A number of viral based systems have been developed for use as gene transfer
vectors for mammalian host cells. For example, retroviruses (in particular, lentiviral
vectors) provide a convenient platform for gene delivery systems. A coding sequence
of interest (for example, a sequence useful for gene therapy applications) can be
inserted into a gene delivery vector and packaged in retroviral particles using
techniques known in the art. Recombinant virus can then be isolated and delivered to
cells of the subject either in vivo or ex vivo. A number of retroviral systems have been
described, including, for example, the following: U.S. Patent No. 5,219,740; Miller et
al. (1989) BioTechniques 7:980; Miller, A.D. (1990) Human Gene Therapy 1:5;
Scarpa et al. (1991) Virology 180:849; Burns et al. (1993) Proc. Natl. Acad. Sci. USA
90:8033; Boris-Lawrie et al. (1993) Cur. Opin. Genet. Develop. 3:102; GB 2200651;
EP 0415731; EP 0345242; WO 89/02468; WO 89/05349; WO 89/09271; WO
90/02806; WO 90/07936; WO 94/03622; WO 93/25698; WO
93/25234; WO 93/11230; WO 93/10218; WO 91/02805; U.S. 5,219,740; U.S.
4,405,712; U.S. 4,861,719; U.S. 4,980,289 and U.S. 4,777,127; U.S. Serial No.
07/800,921; Vile (1993) Cancer Res 53:3860-3864; Vile (1993) Cancer Res
53:962-967; Ram (1993) Cancer Res 53:83-88; Takamiya (1992) J Neurosci Res
33:493-503; Baba (1993) J Neurosurg 79:729-735; Mann (1983) Cell 33:153; Cone
(1984) Proc Natl Acad Sci USA 81:6349; and Miller (1990) Human Gene Therapy 1.

In other embodiments, gene transfer vectors can be constructed to encode a
cytokine or other immunomodulatory molecule. For example, nucleic acid sequences
encoding native IL-2 and gamma-interferon can be obtained as described in US Patent
Nos. 4,738,927 and 5,326,859, respectively, while useful muteins of these proteins can
be obtained as described in U.S. Patent No. 4,853,332. Nucleic acid sequences
encoding the short and long forms of mCSF can be obtained as described in US Patent
Nos. 4,847,201 and 4,879,227, respectively. In particular aspects of the invention,
retroviral vectors expressing cytokine or immunomodulatory genes can be produced as
described herein (for example, employing the packaging cell lines of the present
invention) and in International Application No. PCT US 94/02951, entitled
"Compositions and Methods for Cancer Immunotherapy."

Examples of suitable immunomodulatory molecules for use herein include the
following: IL-1 and IL-2 (Karupiah et al. (1990) J. Immunology 144:290-298, Weber
et al. (1987) J. Exp. Med. 166:1716-1733, Gansbacher et al. (1990) J. Exp. Med.
172:1217-1224, and U.S. Patent No. 4,738,927); IL-3 and IL-4 (Tepper et al. (1989)
Cell 57:503-512, Golumbek et al. (1991) Science 254:713-716, and U.S. Patent No.
5,017,691); IL-5 and IL-6 (Brakenhof et al. (1987) J. Immunol. 139:4116-4121, and
International Publication No. WO 90/06370); IL-7 (U.S. Patent No. 4,965,195); IL-8,
IL-9, IL-10, IL-11, IL-12, and IL-13 (Cytokine Bulletin, Summer 1994); IL-14 and
IL-15; alpha interferon (Finter et al. (1991) Drugs 42:749-765, U.S. Patent Nos.
4,892,743 and 4,966,843, International Publication No. WO 85/02862, Nagata et al.
(1980) Nature 284:316-320, Familletti et al. (1981) Methods in Enz. 78:387-394, Twu
et al. (1989) Proc. Natl. Acad. Sci. USA 86:2046-2050, and Faktor et al. (1990)
Oncogene 5:867-872); beta-interferon (Seif et al. (1991) J. Virol. 65:664-671);
gamma-interferons (Radford et al. (1991) The American Society of Hepatology
2008-2015, Watanabe et al. (1989) Proc. Natl. Acad. Sci. USA 86:9456-9460,
Gansbacher et al. (1990) Cancer Research 50:7820-7825, Maio et al. (1989) Can.
Immunol. Immunother. 30:34-42, and U.S. Patent Nos. 4,762,791 and 4,727,138);
G-CSF (U.S. Patent Nos. 4,999,291 and 4,810,643); GM-CSF (International Publication
No. WO 85/04188).

Immunomodulatory factors may also be agonists, antagonists, or ligands for
these molecules. For example, soluble forms of receptors can often behave as
antagonists for these types of factors, as can mutated forms of the factors themselves.

Nucleic acid molecules that encode the above-described substances, as well as
other nucleic acid molecules that are advantageous for use within the present
invention, may be readily obtained from a variety of sources, including, for example,
depositories such as the American Type Culture Collection, or from commercial
sources such as British Bio-Technology Limited (Cowley, Oxford England).
Representative examples include BBG 12 (containing the GM-CSF gene coding for the
mature protein of 127 amino acids), BBG 6 (which contains sequences encoding
gamma interferon), A.T.C.C. Deposit No. 39656 (which contains sequences encoding
TNF), A.T.C.C. Deposit No. 20663 (which contains sequences encoding
alpha-interferon), A.T.C.C. Deposit Nos. 31902, 31902 and 39517 (which contain sequences
encoding beta-interferon), A.T.C.C. Deposit No. 67024 (which contains a sequence
which encodes Interleukin-1b), A.T.C.C. Deposit Nos. 39405, 39452, 39516, 39626
and 39673 (which contain sequences encoding Interleukin-2), A.T.C.C. Deposit Nos.
59399, 59398, and 67326 (which contain sequences encoding Interleukin-3), A.T.C.C.
Deposit No. 57592 (which contains sequences encoding Interleukin-4), A.T.C.C.
Deposit Nos. 59394 and 59395 (which contain sequences encoding Interleukin-5), and
A.T.C.C. Deposit No. 67153 (which contains sequences encoding Interleukin-6).

Plasmids containing cytokine genes or immunomodulatory genes (International
Publication Nos. WO 94/02951 and WO 96/21015) can be digested with appropriate
restriction enzymes, and DNA fragments containing the particular gene of interest can
be inserted into a gene transfer vector using standard molecular biology techniques.
(See, e.g., Sambrook et al., supra, or Ausubel et al. (eds) Current Protocols in
Molecular Biology, Greene Publishing and Wiley-Interscience).

Polynucleotide sequences coding for the above-described molecules can be
obtained using recombinant methods, such as by screening cDNA and genomic
libraries from cells expressing the gene, or by deriving the gene from a vector known
to include the same. For example, plasmids which contain sequences that encode
altered cellular products may be obtained from a depository such as the A.T.C.C., or
from commercial sources. Plasmids containing the nucleotide sequences of interest
can be digested with appropriate restriction enzymes, and DNA fragments containing
the nucleotide sequences can be inserted into a gene transfer vector using standard
molecular biology techniques.

Alternatively, cDNA sequences for use with the present invention may be
obtained from cells which express or contain the sequences, using standard techniques,
such as phenol extraction and PCR of cDNA or genomic DNA. See, e.g., Sambrook
et al., supra, for a description of techniques used to obtain and isolate DNA. Briefly,
mRNA from a cell which expresses the gene of interest can be reverse transcribed with
reverse transcriptase using oligo-dT or random primers. The single stranded cDNA
may then be amplified by PCR (see U.S. Patent Nos. 4,683,202, 4,683,195 and
4,800,159, see also PCR Technology: Principles and Applications for DNA
Amplification, Erlich (ed.), Stockton Press, 1989) using oligonucleotide primers
complementary to sequences on either side of desired sequences.

The nucleotide sequence of interest can also be produced synthetically, rather
than cloned, using a DNA synthesizer (e.g., an Applied Biosystems Model 392 DNA
Synthesizer, available from ABI, Foster City, California). The nucleotide sequence can
be designed with the appropriate codons for the expression product desired. The
complete coding sequence is assembled from overlapping oligonucleotides prepared by
standard methods. See, e.g., Edge
(1981) Nature 292:756; Nambiar et al. (1984) Science 223:1299; Jay et al. (1984) J.
Biol. Chem. 259:6311.
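
The assembly of a coding sequence from overlapping oligonucleotides can be illustrated in silico. The following Python sketch is illustrative only and is not part of the described method: the function names, the 60-base oligonucleotide length and the 20-base overlap are assumptions chosen for the example, and all fragments are kept on one strand for simplicity, whereas laboratory assembly generally alternates strands.

```python
def split_into_oligos(sequence: str, oligo_len: int = 60, overlap: int = 20) -> list[str]:
    """Split a sequence into fragments in which each fragment shares
    `overlap` bases with the next one (hypothetical helper)."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        if start + oligo_len >= len(sequence):
            break  # the last fragment reaches the end of the sequence
        start += step
    return oligos


def assemble(oligos: list[str], overlap: int = 20) -> str:
    """Rebuild the full sequence, checking that each junction overlaps exactly."""
    assembled = oligos[0]
    for oligo in oligos[1:]:
        if assembled[-overlap:] != oligo[:overlap]:
            raise ValueError("adjacent oligonucleotides do not overlap as expected")
        assembled += oligo[overlap:]
    return assembled


if __name__ == "__main__":
    # Arbitrary example sequence, not taken from any figure of this document.
    example = "ATGGGTGCTAGAGCTTCTGTTCTGTCTGGTGGTGAACTGGAC" * 5
    fragments = split_into_oligos(example)
    print(len(fragments), "oligonucleotides; round trip ok:",
          assemble(fragments) == example)
```

The overlap check at each junction plays the role that annealing of complementary ends plays at the bench: a fragment that does not match its neighbor is rejected rather than silently joined.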

The synthetic expression cassettes of the present invention can be employed in
the construction of packaging cell lines for use with retroviral vectors.

One type of retrovirus, the murine leukemia virus, or "MLV", has been widely
utilized for gene therapy applications (see generally Mann et al. (Cell 33:153, 1983),
Cone and Mulligan (Proc. Natl. Acad. Sci. USA 81:6349, 1984), and Miller et al.,
Human Gene Therapy 1:5-14, 1990).

Lentiviral vectors typically comprise a 5' lentiviral LTR, a tRNA binding site, a
packaging signal, a promoter operably linked to one or more genes of interest, an
origin of second strand DNA synthesis and a 3' lentiviral LTR, wherein the lentiviral
vector contains a nuclear transport element. The nuclear transport element may be
located either upstream (5') or downstream (3') of a coding sequence of interest (for
example, a synthetic Gag or Env expression cassette of the present invention). Within
certain embodiments, the nuclear transport element is not RRE. Within one
embodiment the packaging signal is an extended packaging signal. Within other
embodiments the promoter is a tissue specific promoter, or, alternatively, a promoter
such as CMV. Within other embodiments, the lentiviral vector further comprises an
internal ribosome entry site.

A wide variety of lentiviruses may be utilized within the context of the present
invention, including for example, lentiviruses selected from the group consisting of
HIV, HIV-1, HIV-2, FIV and SIV.

Within yet another aspect of the invention, host cells (e.g., packaging cell lines)
are provided which contain any of the expression cassettes described herein. For
example, within one aspect packaging cell lines are provided comprising an expression
cassette that comprises a sequence encoding synthetic Gag-polymerase, and a nuclear
transport element, wherein the promoter is operably linked to the sequence encoding
Gag-polymerase. Packaging cell lines may further comprise a promoter and a sequence
encoding tat, rev, or an envelope, wherein the promoter is operably linked to the
sequence encoding tat, rev, Env or sequences encoding modified versions of these
proteins. The packaging cell line may further comprise a sequence encoding any one
or more of other HIV gene encoding sequences.

In one embodiment, the expression cassette (carrying, for example, the
synthetic Gag-polymerase) is stably integrated. The packaging cell line, upon
introduction of a lentiviral vector, typically produces particles. The promoter
regulating expression of the synthetic expression cassette may be inducible. Typically,
the packaging cell line, upon introduction of a lentiviral vector, produces particles that
are essentially free of replication competent virus.

Packaging cell lines are provided comprising an expression cassette which
directs the expression of a synthetic Gag-polymerase gene or comprising an expression
cassette which directs the expression of a synthetic Env gene described herein. (See
also Andre, S., et al., Journal of Virology 72(2):1497-1503, 1998; Haas, J., et al.,
Current Biology 6(3):315-324, 1996, for a description of other modified Env
sequences.) A lentiviral vector is introduced into the packaging cell line to produce a
vector-producing cell line.

As noted above, lentiviral vectors can be designed to carry or express a
selected gene(s) or sequences of interest. Lentiviral vectors may be readily
constructed from a wide variety of lentiviruses (see RNA Tumor Viruses, Second
Edition, Cold Spring Harbor Laboratory, 1985). Representative examples of
lentiviruses include HIV, HIV-1, HIV-2, FIV and SIV. Such lentiviruses may either
be obtained from patient isolates, or, more preferably, from depositories or collections
such as the American Type Culture Collection, or isolated from known sources using
available techniques.

Portions of the lentiviral gene delivery vectors (or vehicles) may be derived
from different viruses. For example, in a given recombinant lentiviral vector, LTRs
may be derived from an HIV, a packaging signal from SIV, and an origin of second
strand synthesis from HIV-2. Lentiviral vector constructs may comprise a 5' lentiviral
LTR, a tRNA binding site, a packaging signal, one or more heterologous sequences,
an origin of second strand DNA synthesis and a 3' LTR, wherein said lentiviral vector
contains a nuclear transport element that is not RRE.
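
The element layout recited above can be expressed as a simple ordered data structure and checked programmatically. The following Python sketch is illustrative only: the element labels, the `Element` class and the `validate_vector` function are hypothetical names introduced for this example, and the 5'-to-3' ordering it enforces is an assumption taken from the order in which the elements are listed.

```python
from dataclasses import dataclass

# Assumed 5'-to-3' order of the recited elements (taken from the listing order).
REQUIRED_ORDER = [
    "5' LTR",
    "tRNA binding site",
    "packaging signal",
    "heterologous sequence",
    "origin of second strand DNA synthesis",
    "3' LTR",
]


@dataclass(frozen=True)
class Element:
    kind: str       # one of the labels above, or "nuclear transport element"
    name: str = ""  # free-text identifier, e.g. the source virus or element name


def validate_vector(elements: list[Element]) -> list[str]:
    """Return a list of problems; an empty list means the layout passes."""
    problems = []
    kinds = [e.kind for e in elements]
    pos = 0
    for required in REQUIRED_ORDER:
        try:
            # Each required element must appear after the previous one.
            pos = kinds.index(required, pos) + 1
        except ValueError:
            problems.append(f"missing or out of order: {required}")
    transport = [e for e in elements if e.kind == "nuclear transport element"]
    if not transport:
        problems.append("no nuclear transport element")
    if any(e.name.upper() == "RRE" for e in transport):
        problems.append("nuclear transport element is RRE")
    return problems


if __name__ == "__main__":
    layout = [
        Element("5' LTR", "HIV"),
        Element("tRNA binding site"),
        Element("packaging signal", "SIV"),
        Element("nuclear transport element", "example non-RRE element"),
        Element("heterologous sequence", "gene of interest"),
        Element("origin of second strand DNA synthesis", "HIV-2"),
        Element("3' LTR", "HIV"),
    ]
    print(validate_vector(layout))
```

The nuclear transport element is deliberately left out of the ordered list, since it may be placed either upstream or downstream of the coding sequence of interest.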

Briefly, Long Terminal Repeats ("LTRs") are subdivided into three elements,
designated U5, R and U3. These elements contain a variety of signals which are
responsible for the biological activity of a retrovirus, including for example, promoter
and enhancer elements which are located within U3. LTRs may be readily identified in
the provirus (integrated DNA form) due to their precise duplication at either end of the
genome. As utilized herein, a 5' LTR should be understood to include a 5' promoter
element and sufficient LTR sequence to allow reverse transcription and integration of
the DNA form of the vector. The 3' LTR should be understood to include a
polyadenylation signal, and sufficient LTR sequence to allow reverse transcription and
integration of the DNA form of the vector.

The tRNA binding site and origin of second strand DNA synthesis are also
important for a retrovirus to be biologically active, and may be readily identified by one
of skill in the art. For example, retroviral tRNA binds to a tRNA binding site by
Watson-Crick base pairing, and is carried with the retrovirus genome into a viral
particle. The tRNA is then utilized as a primer for DNA synthesis by reverse
transcriptase. The tRNA binding site may be readily identified based upon its location
just downstream from the 5' LTR. Similarly, the origin of second strand DNA synthesis
is, as its name implies, important for the second strand DNA synthesis of a retrovirus.
This region, which is also referred to as the poly-purine tract, is located just upstream
of the 3' LTR.

In addition to a 5' and 3' LTR, tRNA binding site, and origin of second strand
DNA synthesis, recombinant retroviral vector constructs may also comprise a
packaging signal, as well as one or more genes or coding sequences of interest. In
addition, the lentiviral vectors have a nuclear transport element which, in preferred
embodiments, is not RRE. Representative examples of suitable nuclear transport
elements include the element in Rous sarcoma virus (Ogert, et al., J. Virol. 70, 3834-
3843, 1996), the element in Rous sarcoma virus (Liu & Mertz, Genes & Dev., 9, 1766-
1789, 1995) and the element in the genome of simian retrovirus type I (Zolotukhin, et
al., J. Virol. 68, 7944-7952, 1994). Other potential elements include the elements in
the histone gene (Kedes, Annu. Rev. Biochem. 48, 837-870, 1970), the α-interferon
gene (Nagata et al., Nature 287, 401-408, 1980), the β-adrenergic receptor gene
(Kobilka, et al., Nature 329, 75-79, 1987), and the c-Jun gene (Hattorie, et al., Proc.
Natl. Acad. Sci. USA 85, 9148-9152, 1988).

Recombinant lentiviral vector constructs typically lack both Gag-polymerase
and Env coding sequences. Recombinant lentiviral vectors typically contain fewer than
20, preferably fewer than 15, more preferably fewer than 10, and most preferably fewer
than 8 consecutive nucleotides found in Gag-polymerase and Env genes. One advantage
of the present invention is that the synthetic Gag-polymerase expression cassettes,
which can be used to construct packaging cell lines for the recombinant retroviral
vector constructs, have little homology to wild-type Gag-polymerase sequences and thus
considerably reduce or eliminate the possibility of homologous recombination between
the synthetic and wild-type sequences.
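
The limit on consecutive shared nucleotides can be illustrated with a short Python sketch. It is not part of the disclosed method: the function names and the toy sequences are hypothetical, and the sketch compares only the strands as given (a practical check would also consider the reverse complement). It reports the longest run of consecutive nucleotides common to a vector sequence and a reference coding sequence, which can then be compared against a chosen limit such as 8.

```python
def longest_shared_run(vector_seq: str, reference_seq: str) -> int:
    """Return the length of the longest stretch of consecutive
    nucleotides that occurs in both sequences (same strand only)."""
    a, b = vector_seq.upper(), reference_seq.upper()
    best = 0
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def is_below_limit(vector_seq: str, reference_seq: str, limit: int = 8) -> bool:
    """True when the vector shares fewer than `limit` consecutive
    nucleotides with the reference sequence."""
    return longest_shared_run(vector_seq, reference_seq) < limit


# Toy sequences chosen for illustration only; they are not HIV sequences.
toy_vector = "AAACGTACGTTT"
toy_reference = "GGGCGTACGCC"
shared = longest_shared_run(toy_vector, toy_reference)
```

The dynamic-programming table is kept to two rows, so memory use grows with the length of the reference sequence only.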

Lentiviral vectors may also include tissue-specific promoters to drive
expression of one or more genes or sequences of interest.

Lentiviral vector constructs may be generated such that more than one gene of
interest is expressed. This may be accomplished through the use of di- or oligo-
cistronic cassettes (e.g., where the coding regions are separated by 80 nucleotides or
less, see generally Levin et al., Gene 108:167-174, 1991), or through the use of
Internal Ribosome Entry Sites ("IRES").

Packaging cell lines suitable for use with the above described recombinant
retroviral vector constructs may be readily prepared given the disclosure provided
herein. Briefly, the parent cell line from which the packaging cell line is derived can be
selected from a variety of mammalian cell lines, including for example, 293, RD, COS-7,
CHO, BHK, VERO, HT1080, and myeloma cells.

After selection of a suitable host cell for the generation of a packaging cell line,
one or more expression cassettes are introduced into the cell line in order to
complement or supply in trans components of the vector which have been deleted.

Representative examples of suitable synthetic HIV polynucleotide sequences
have been described herein for use in expression cassettes of the present invention. As
described above, the native and/or synthetic coding sequences may also be utilized in
these expression cassettes.

Utilizing the above-described expression cassettes, a wide variety of packaging
cell lines can be generated. For example, within one aspect packaging cell lines are
provided comprising an expression cassette that comprises a sequence encoding
synthetic Gag-polymerase, and a nuclear transport element, wherein the promoter is
operably linked to the sequence encoding Gag-polymerase. Within other aspects,
packaging cell lines are provided comprising a promoter and a sequence encoding tat,
rev, Env, or other HIV antigens or epitopes derived therefrom, wherein the promoter
is operably linked to the sequence encoding tat, rev, Env, or the HIV antigen or
epitope. Within further embodiments, the packaging cell line may comprise a sequence
encoding any one or more of tat, rev, nef, vif, vpu or vpr. For example, the packaging
cell line may contain only tat, rev, nef, vif, vpu, or vpr alone, tat rev and nef, nef and
vif, nef and vpu, nef and vpr, vif and vpu, vif and vpr, vpu and vpr, nef vif and vpu, nef
vif and vpr, nef vpu and vpr, vif vpu and vpr, all four of nef, vif, vpu, and vpr, etc.
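
The phrase "any one or more of tat, rev, nef, vif, vpu or vpr" covers every non-empty subset of the six genes. As a minimal sketch (the constant and function names are hypothetical and serve only to make the enumeration concrete), the possible gene sets can be listed in Python as follows.

```python
from itertools import combinations

# Gene names taken from the paragraph above; the tuple name is illustrative.
ACCESSORY_AND_REGULATORY = ("tat", "rev", "nef", "vif", "vpu", "vpr")


def gene_combinations(genes=ACCESSORY_AND_REGULATORY):
    """Yield every non-empty subset of `genes`, smallest subsets first."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)


combos = list(gene_combinations())
single_gene_sets = [c for c in combos if len(c) == 1]
```

With six genes there are 2^6 - 1 = 63 such sets, of which the text names only a representative few.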

In one embodiment, the expression cassette is stably integrated. Within
another embodiment, the packaging cell line, upon introduction of a lentiviral vector,
produces particles. Within further embodiments the promoter is inducible. Within
certain preferred embodiments of the invention, the packaging cell line, upon
introduction of a lentiviral vector, produces particles that are free of replication
competent virus.

The synthetic cassettes containing modified coding sequences are transfected
into a selected cell line. Transfected cells are selected that (i) carry, typically,
integrated, stable copies of the HIV coding sequences, and (ii) are expressing
acceptable levels of these polypeptides (expression can be evaluated by methods
known in the prior art in view of the teachings of the present disclosure). The ability
of the cell line to produce VLPs may also be verified.

A sequence of interest is constructed into a suitable viral vector as discussed
above. This defective virus is then transfected into the packaging cell line. The
packaging cell line provides the viral functions necessary for producing virus-like
particles into which the defective viral genome, containing the sequence of interest, is
packaged. These VLPs are then isolated and can be used, for example, in gene
delivery or gene therapy.

Further, such packaging cell lines can also be used to produce VLPs alone,
which can, for example, be used as adjuvants for administration with other antigens or
in vaccine compositions. Also, co-expression of a selected sequence of interest
encoding a polypeptide (for example, an antigen) in the packaging cell line can also
result in the entrapment and/or association of the selected polypeptide in/with the
VLPs.

Various forms of the different embodiments of the present invention (e.g.,
synthetic constructs) may be combined.

2.4.0 DNA Immunization and Gene Delivery

A variety of HIV polypeptide antigens, particularly HIV antigens, can be used
in the practice of the present invention. HIV antigens can be included in DNA
immunization constructs containing, for example, a synthetic Env expression cassette,
a synthetic Gag expression cassette, a synthetic pol-derived polypeptide expression
cassette, a synthetic expression cassette comprising sequences encoding one or more
accessory or regulatory genes (e.g., tat, rev, nef, vif, vpu, vpr), and/or a synthetic Gag
expression cassette fused in-frame to a coding sequence for the polypeptide antigen
(synthetic or wild-type), where expression of the construct results in VLPs presenting
the antigen of interest.

HIV antigens of particular interest to be used in the practice of the present
invention include pol, tat, rev, nef, vif, vpu, vpr, and other HIV-1 (also known as
HTLV-III, LAV, ARV, etc.) antigens or epitopes derived therefrom, including, but not
limited to, antigens such as gp120, gp41, gp160 (both native and modified); Gag; and
pol from a variety of isolates including, but not limited to, HIVIIIb, HIVSF2, HIV-1SF162,
HIV-1SF170, HIVLAV, HIVLAI, HIVMN, HIV-1CM235, HIV-1US4, other HIV-1 strains from
diverse subtypes (e.g., subtypes A through G, and O), HIV-2 strains and diverse
subtypes (e.g., HIV-2UC1 and HIV-2UC2). See, e.g., Myers, et al., Los Alamos
Database, Los Alamos National Laboratory, Los Alamos, New Mexico; Myers, et al.,
Human Retroviruses and AIDS, 1990, Los Alamos, New Mexico: Los Alamos National
Laboratory. These antigens may be synthetic (as described herein) or wild-type.

To evaluate efficacy, DNA immunization using synthetic expression cassettes
of the present invention can be performed, for example, as follows. Mice are
immunized with a tat/rev/nef synthetic expression cassette. Other mice are immunized
with a tat/rev/nef wild type expression cassette. Mouse immunizations with plasmid-
DNAs typically show that the synthetic expression cassettes provide a clear
improvement of immunogenicity relative to the native expression cassettes. Also, a
second boost immunization will induce a secondary immune response, for example,
after approximately two weeks. Further, the results of CTL assays typically show
increased potency of synthetic expression cassettes for induction of cytotoxic T-
lymphocyte (CTL) responses by DNA immunization.

Exemplary primate studies directed at the evaluation of neutralizing antibodies
and cellular immune responses against HIV are described below.

It is readily apparent that the subject invention can be used to mount an
immune response to a wide variety of antigens and hence to treat or prevent infection,
particularly HIV infection.

2.4.1 Delivery of the synthetic expression cassettes of the present invention

Polynucleotide sequences coding for the above-described molecules can be
obtained using recombinant methods, such as by screening cDNA and genomic
libraries from cells expressing the gene, or by deriving the gene from a vector known
to include the same. Furthermore, the desired gene can be isolated directly from cells
and tissues containing the same, using standard techniques, such as phenol extraction
and PCR of cDNA or genomic DNA. See, e.g., Sambrook et al., supra, for a
description of techniques used to obtain and isolate DNA. The gene of interest can
also be produced synthetically, rather than cloned. The nucleotide sequence can be
designed with the appropriate codons for the particular amino acid sequence desired.
In general, one will select preferred codons for the intended host in which the sequence
will be expressed. The complete sequence is assembled from overlapping
oligonucleotides prepared by standard methods and assembled into a complete coding
sequence. See, e.g., Edge, Nature (1981) 292:756; Nambiar et al., Science (1984)
223:1299; Jay et al., J. Biol. Chem. (1984) 259:6311; Stemmer, W.P.C., (1995) Gene
164:49-53.
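
The two steps described above, choosing a preferred codon for each amino acid and splitting the designed sequence into overlapping oligonucleotides, can be sketched in Python. This is an illustration under stated assumptions, not the procedure of the cited references: the codon table is a hypothetical placeholder rather than measured codon-usage data, the oligonucleotide length and overlap are arbitrary, and all oligonucleotides are taken from one strand for simplicity (laboratory assembly normally alternates strands).

```python
# Hypothetical one-codon-per-residue table for illustration only;
# it is NOT measured codon-usage data for any host.
PREFERRED_CODON = {
    "M": "ATG", "G": "GGC", "A": "GCC", "R": "CGC", "S": "AGC",
    "V": "GTG", "L": "CTG", "K": "AAG", "E": "GAG", "*": "TGA",
}


def back_translate(protein: str) -> str:
    """Encode each residue with the single preferred codon from the table."""
    return "".join(PREFERRED_CODON[residue] for residue in protein)


def overlapping_oligos(seq: str, oligo_len: int = 20, overlap: int = 8) -> list:
    """Split `seq` into segments that each share `overlap` bases with the next."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    step = oligo_len - overlap
    oligos, start = [], 0
    while True:
        oligos.append(seq[start:start + oligo_len])
        if start + oligo_len >= len(seq):
            break
        start += step
    return oligos


def reassemble(oligos: list, overlap: int = 8) -> str:
    """Rebuild the full sequence, checking each overlap as it is joined."""
    seq = oligos[0]
    for oligo in oligos[1:]:
        if seq[-overlap:] != oligo[:overlap]:
            raise ValueError("adjacent oligonucleotides do not overlap")
        seq += oligo[overlap:]
    return seq


toy_protein = "MGARSVLKE*"          # toy peptide, not an HIV sequence
toy_gene = back_translate(toy_protein)
toy_oligos = overlapping_oligos(toy_gene)
```

Reassembling `toy_oligos` with `reassemble` returns `toy_gene`, which is a simple consistency check on the chosen length and overlap.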

Next, the gene sequence encoding the desired antigen can be inserted into a
vector containing a synthetic expression cassette of the present invention. In one
embodiment, polynucleotides encoding selected antigens are separately cloned into
expression vectors (e.g., Env-coding polynucleotide in a first vector, Gag-coding
polynucleotide in a second vector, Pol-derived polypeptide-coding polynucleotide in a
third vector, tat-, rev-, nef-, vif-, vpu-, vpr-coding polynucleotides in further vectors,
etc.). In certain embodiments, the antigen is inserted into or adjacent a synthetic Gag
coding sequence such that when the combined sequence is expressed it results in the
production of VLPs comprising the Gag polypeptide and the antigen of interest, e.g.,
Env (native or modified) or other antigen(s) (native or modified) derived from HIV.
Insertions can be made within the coding sequence or at either end of the coding
sequence (5', amino terminus of the expressed Gag polypeptide; or 3', carboxy
terminus of the expressed Gag polypeptide) (Wagner, R., et al., Arch. Virol. 127:117-
137, 1992; Wagner, R., et al., Virology 200:162-175, 1994; Wu, X., et al., J. Virol.
69(6):3389-3398, 1995; Wang, C-T., et al., Virology 200:524-534, 1994; Chazal, N.,
et al., Virology 68(1):111-122, 1994; Griffiths, J.C., et al., J. Virol. 67(6):3191-3198,
1993; Reicin, A.S., et al., J. Virol. 69(2):642-650, 1995).

Up to 50% of the coding sequences of p55Gag can be deleted without
affecting the assembly to virus-like particles and expression efficiency (Borsetti, A., et
al., J. Virol. 72(11):9313-9317, 1998; Garnier, L., et al., J. Virol. 72(6):4667-4677,
1998; Zhang, Y., et al., J. Virol. 72(3):1782-1789, 1998; Wang, C., et al., J. Virol.
72(10):7950-7959, 1998). In one embodiment of the present invention,
immunogenicity of the high level expressing synthetic Gag expression cassettes can be
increased by the insertion of different structural or non-structural HIV antigens,
multiepitope cassettes, or cytokine sequences into deleted regions of Gag sequence.
Such deletions may be generated following the teachings of the present invention and
information available to one of ordinary skill in the art. One possible advantage of this
approach, relative to using full-length sequences fused to heterologous polypeptides,
can be higher expression/secretion efficiency of the expression product.

When sequences are added to the amino terminal end of Gag, the polynucleotide
can contain coding sequences at the 5' end that encode a signal for addition of a
myristic moiety to the Gag-containing polypeptide (e.g., sequences that encode Met-
Gly).

The ability of Gag-containing polypeptide constructs to form VLPs can be 
empirically determined following the teachings of the present specification. 

The synthetic expression cassettes can also include control elements operably
linked to the coding sequence, which allow for the expression of the gene in vivo in the
subject species. For example, typical promoters for mammalian cell expression include
the SV40 early promoter, a CMV promoter such as the CMV immediate early
promoter, the mouse mammary tumor virus LTR promoter, the adenovirus major late
promoter (Ad MLP), and the herpes simplex virus promoter, among others. Other
nonviral promoters, such as a promoter derived from the murine metallothionein gene,
will also find use for mammalian expression. Typically, transcription termination and
polyadenylation sequences will also be present, located 3' to the translation stop
codon. Preferably, a sequence for optimization of initiation of translation, located 5'
to the coding sequence, is also present. Examples of transcription
terminator/polyadenylation signals include those derived from SV40, as described in
Sambrook et al., supra, as well as a bovine growth hormone terminator sequence.

Enhancer elements may also be used herein to increase expression levels of the
mammalian constructs. Examples include the SV40 early gene enhancer, as described
in Dijkema et al., EMBO J. (1985) 4:761, the enhancer/promoter derived from the
long terminal repeat (LTR) of the Rous Sarcoma Virus, as described in Gorman et al.,
Proc. Natl. Acad. Sci. USA (1982b) 79:6777, and elements derived from human CMV,
as described in Boshart et al., Cell (1985) 41:521, such as elements included in the
CMV intron A sequence.

Furthermore, plasmids can be constructed which include chimeric antigen-
coding gene sequences, encoding, e.g., multiple antigens/epitopes of interest, for
example derived from more than one viral isolate.

Typically the antigen coding sequences precede or follow the synthetic coding
sequence and the chimeric transcription unit will have a single open reading frame
encoding both the antigen of interest and the synthetic coding sequences.
Alternatively, multi-cistronic cassettes (e.g., bi-cistronic cassettes) can be constructed
allowing expression of multiple antigens from a single mRNA using the EMCV IRES,
or the like (Example 7).
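
The requirement that a chimeric transcription unit carry a single open reading frame can be checked mechanically. The following Python sketch is illustrative only: the function names and toy sequences are hypothetical, and the check covers reading frame and stop codons alone, not expression or VLP formation. It joins two coding sequences by dropping the stop codon of the upstream one and then verifies that the result reads as one frame from start codon to stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(seq: str) -> list:
    """Split a sequence whose length is a multiple of three into codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def is_single_orf(seq: str) -> bool:
    """True if `seq` starts with ATG, ends with a stop codon, has a length
    divisible by three, and contains no in-frame internal stop codon."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    triplets = split_codons(seq)
    return (
        triplets[0] == "ATG"
        and triplets[-1] in STOP_CODONS
        and not any(c in STOP_CODONS for c in triplets[1:-1])
    )


def fuse_in_frame(upstream: str, downstream: str) -> str:
    """Join two coding sequences, removing the upstream stop codon so that
    translation continues into the downstream sequence."""
    up, down = upstream.upper(), downstream.upper()
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    return up + down


# Toy sequences for illustration only; they are not Gag or antigen sequences.
toy_upstream = "ATGGGCGCCTGA"
toy_downstream = "AAGGAGTAA"
toy_fusion = fuse_in_frame(toy_upstream, toy_downstream)
```

Simple concatenation without removing the upstream stop codon, or with a single extra base at the junction, fails `is_single_orf`, which is the situation the in-frame requirement is meant to exclude.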

In one embodiment of the present invention, a nucleic acid immunizing
composition may comprise, for example, the following: a first expression vector
comprising a Gag expression cassette, a second vector comprising an Env expression
cassette, and a third expression vector comprising a Pol expression cassette, or one or
more coding regions of Pol (e.g., Prot, RT, RNase, Int), wherein further antigen coding
sequences may be associated with the Pol expression (such antigens may be obtained,
for example, from accessory genes (e.g., vpr, vpu, vif), regulatory genes (e.g., nef, tat,
rev), or portions of the Pol sequences (e.g., Prot, RT, RNase, Int)). In another
embodiment, a nucleic acid immunizing composition may comprise, for example, an
expression cassette comprising any of the synthetic polynucleotide sequences of the
present invention. In another embodiment, a nucleic acid immunizing composition may
comprise, for example, an expression cassette comprising coding sequences for a
number of HIV genes (or sequences derived from such genes) wherein the coding
sequences are in-frame and under the control of a single promoter, for example, Gag-
Env constructs, Tat-Rev-Nef constructs, P2Pol-tat-rev-nef constructs, etc. The
synthetic coding sequences of the present invention may be combined in any number of
combinations depending on the coding sequence products (i.e., HIV polypeptides) to
which, for example, an immunological response is desired to be raised. In yet another
embodiment, synthetic coding sequences for multiple HIV-derived polypeptides may
be constructed into a polycistronic message under the control of a single promoter
wherein IRES are placed adjacent the coding sequence for each encoded polypeptide.

Once complete, the constructs are used for nucleic acid immunization using
standard gene delivery protocols. Methods for gene delivery are known in the art.
See, e.g., U.S. Patent Nos. 5,399,346, 5,580,859, 5,589,466. Genes can be delivered
either directly to the vertebrate subject or, alternatively, delivered ex vivo, to cells
derived from the subject and the cells reimplanted in the subject.

A number of viral based systems have been developed for gene transfer into
mammalian cells. For example, retroviruses provide a convenient platform for gene
delivery systems. Selected sequences can be inserted into a vector and packaged in
retroviral particles using techniques known in the art. The recombinant virus can then
be isolated and delivered to cells of the subject either in vivo or ex vivo. A number of
retroviral systems have been described (U.S. Patent No. 5,219,740; Miller and
Rosman, BioTechniques (1989) 7:980-990; Miller, A.D., Human Gene Therapy
(1990) 1:5-14; Scarpa et al., Virology (1991) 180:849-852; Burns et al., Proc. Natl.
Acad. Sci. USA (1993) 90:8033-8037; and Boris-Lawrie and Temin, Cur. Opin.
Genet. Develop. (1993) 3:102-109).

A number of adenovirus vectors have also been described. Unlike retroviruses
which integrate into the host genome, adenoviruses persist extrachromosomally thus
minimizing the risks associated with insertional mutagenesis (Haj-Ahmad and Graham,
J. Virol. (1986) 57:267-274; Bett et al., J. Virol. (1993) 67:5911-5921; Mittereder et
al., Human Gene Therapy (1994) 5:717-729; Seth et al., J. Virol. (1994) 68:933-940;
Barr et al., Gene Therapy (1994) 1:51-58; Berkner, K.L. BioTechniques (1988) 6:616-
629; and Rich et al., Human Gene Therapy (1993) 4:461-476).

Additionally, various adeno-associated virus (AAV) vector systems have been
developed for gene delivery. AAV vectors can be readily constructed using techniques
well known in the art. See, e.g., U.S. Patent Nos. 5,173,414 and 5,139,941;
International Publication Nos. WO 92/01070 (published 23 January 1992) and WO
93/03769 (published 4 March 1993); Lebkowski et al., Molec. Cell. Biol. (1988)
8:3988-3996; Vincent et al., Vaccines 90 (1990) (Cold Spring Harbor Laboratory
Press); Carter, B.J. Current Opinion in Biotechnology (1992) 3:533-539; Muzyczka,
N. Current Topics in Microbiol. and Immunol. (1992) 158:97-129; Kotin, R.M.
Human Gene Therapy (1994) 5:793-801; Shelling and Smith, Gene Therapy (1994)
1:165-169; and Zhou et al., J. Exp. Med. (1994) 179:1867-1875.

Another vector system useful for delivering the polynucleotides of the present
invention is the enterically administered recombinant poxvirus vaccines described by
Small, Jr., P.A., et al. (U.S. Patent No. 5,676,950, issued October 14, 1997).

Additional viral vectors which will find use for delivering the nucleic acid
molecules encoding the antigens of interest include those derived from the pox family
of viruses, including vaccinia virus and avian poxvirus. By way of example, vaccinia
virus recombinants expressing the genes can be constructed as follows. The DNA
encoding the particular synthetic HIV polypeptide coding sequence is first inserted into
an appropriate vector so that it is adjacent to a vaccinia promoter and flanking vaccinia
DNA sequences, such as the sequence encoding thymidine kinase (TK). This vector is
then used to transfect cells which are simultaneously infected with vaccinia.
Homologous recombination serves to insert the vaccinia promoter plus the gene
encoding the coding sequences of interest into the viral genome. The resulting TK-
recombinant can be selected by culturing the cells in the presence of
5-bromodeoxyuridine and picking viral plaques resistant thereto.

Alternatively, avipoxviruses, such as the fowlpox and canarypox viruses, can
also be used to deliver the genes. Recombinant avipox viruses, expressing
immunogens from mammalian pathogens, are known to confer protective immunity
when administered to non-avian species. The use of an avipox vector is particularly
desirable in human and other mammalian species since members of the avipox genus
can only productively replicate in susceptible avian species and therefore are not
infective in mammalian cells. Methods for producing recombinant avipoxviruses are
known in the art and employ genetic recombination, as described above with respect to
the production of vaccinia viruses. See, e.g., WO 91/12882; WO 89/03429; and WO
92/03545.

Molecular conjugate vectors, such as the adenovirus chimeric vectors described
in Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl.
Acad. Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

Members of the Alphavirus genus, such as, but not limited to, vectors derived
from the Sindbis, Semliki Forest, and Venezuelan Equine Encephalitis viruses, will also
find use as viral vectors for delivering the polynucleotides of the present invention (for
example, a synthetic Gag-polypeptide encoding expression cassette). For a description
of Sindbis-virus derived vectors useful for the practice of the instant methods, see,
Dubensky et al., J. Virol. (1996) 70:508-519; and International Publication Nos. WO
95/07995 and WO 96/17072; as well as, Dubensky, Jr., T.W., et al., U.S. Patent No.
5,843,723, issued December 1, 1998, and Dubensky, Jr., T.W., U.S. Patent No.
5,789,245, issued August 4, 1998. Preferred expression systems include, but are not
limited to, eucaryotic layered vector initiation systems (e.g., US Patent No. 6,015,686,
US Patent No. 5,814,482, US Patent No. 6,015,694, US Patent No. 5,789,245, EP
1029068A2, WO 9918226A2/A3, EP 00907746A2, WO 9738087A2).

A vaccinia based infection/transfection system can be conveniently used to
provide for inducible, transient expression of the coding sequences of interest in a host
cell. In this system, cells are first infected in vitro with a vaccinia virus recombinant
that encodes the bacteriophage T7 RNA polymerase. This polymerase displays
exquisite specificity in that it only transcribes templates bearing T7 promoters.
Following infection, cells are transfected with the polynucleotide of interest, driven by
a T7 promoter. The polymerase expressed in the cytoplasm from the vaccinia virus
recombinant transcribes the transfected DNA into RNA which is then translated into
protein by the host translational machinery. The method provides for high level,
transient, cytoplasmic production of large quantities of RNA and its translation
products. See, e.g., Elroy-Stein and Moss, Proc. Natl. Acad. Sci. USA (1990)
87:6743-6747; Fuerst et al., Proc. Natl. Acad. Sci. USA (1986) 83:8122-8126.

As an alternative approach to infection with vaccinia or avipox virus
recombinants, or to the delivery of genes using other viral vectors, an amplification
system can be used that will lead to high level expression following introduction into
host cells. Specifically, a T7 RNA polymerase promoter preceding the coding region
for T7 RNA polymerase can be engineered. Translation of RNA derived from this
template will generate T7 RNA polymerase which in turn will transcribe more
template. Concomitantly, there will be a cDNA whose expression is under the control
of the T7 promoter. Thus, some of the T7 RNA polymerase generated from
translation of the amplification template RNA will lead to transcription of the desired
gene. Because some T7 RNA polymerase is required to initiate the amplification, T7
RNA polymerase can be introduced into cells along with the template(s) to prime the
transcription reaction. The polymerase can be introduced as a protein or on a plasmid
encoding the RNA polymerase. For a further discussion of T7 systems and their use
for transforming cells, see, e.g., International Publication No. WO 94/26911; Studier
and Moffatt, J. Mol. Biol. (1986) 189:113-130; Deng and Wolff, Gene (1994)
143:245-249; Gao et al., Biochem. Biophys. Res. Commun. (1994) 200:1201-1206;
Gao and Huang, Nuc. Acids Res. (1993) 21:2867-2872; Chen et al., Nuc. Acids Res.
(1994) 22:2114-2120; and U.S. Patent No. 5,135,855.

Delivery of the expression cassettes of the present invention can also be
accomplished using eucaryotic expression vectors comprising CMV-derived elements;
such vectors include, but are not limited to, the following: pCMVKm2, pCMV-link,
pCMVPLEdhfr, and pCMV6a (all described above).

Synthetic expression cassettes of interest can also be delivered without a viral
vector. For example, the synthetic expression cassette can be packaged in liposomes
prior to delivery to the subject or to cells derived therefrom. Lipid encapsulation is
generally accomplished using liposomes which are able to stably bind or entrap and
retain nucleic acid. The ratio of condensed DNA to lipid preparation can vary but will
generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of
the use of liposomes as carriers for delivery of nucleic acids, see, Hug and Sleight,
Biochim. Biophys. Acta (1991) 1097:1-17; Straubinger et al., in Methods of
Enzymology (1983), Vol. 101, pp. 512-527.
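
The stated ratio of about 1:1 (mg DNA:micromoles lipid), "or more of lipid", amounts to a simple proportionality. A minimal Python sketch follows; the function name and the example quantities are hypothetical and carry no formulation guidance.

```python
def lipid_micromoles(dna_mg: float, micromoles_per_mg_dna: float = 1.0) -> float:
    """Micromoles of lipid for a given mass of DNA at the chosen ratio.

    The default of 1.0 corresponds to the 1:1 (mg DNA:micromoles lipid)
    figure in the text; larger values correspond to "more of lipid".
    """
    if dna_mg < 0 or micromoles_per_mg_dna <= 0:
        raise ValueError("quantities must be non-negative and the ratio positive")
    return dna_mg * micromoles_per_mg_dna


# Example quantities chosen for illustration only.
at_one_to_one = lipid_micromoles(0.5)          # 0.5 mg DNA at 1:1
with_more_lipid = lipid_micromoles(0.5, 2.0)   # same DNA, twice the lipid
```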

Liposomal preparations for use in the present invention include cationic
(positively charged), anionic (negatively charged) and neutral preparations, with
cationic liposomes particularly preferred. Cationic liposomes have been shown to
mediate intracellular delivery of plasmid DNA (Felgner et al., Proc. Natl. Acad. Sci.
USA (1987) 84:7413-7416); mRNA (Malone et al., Proc. Natl. Acad. Sci. USA (1989)
86:6077-6081); and purified transcription factors (Debs et al., J. Biol. Chem. (1990)
265:10189-10192), in functional form.

Cationic liposomes are readily available. For example,
N[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) liposomes are available under
the trademark Lipofectin, from GIBCO BRL, Grand Island, NY. (See, also, Felgner et
al., Proc. Natl. Acad. Sci. USA (1987) 84:7413-7416). Other commercially available
lipids include (DDAB/DOPE) and DOTAP/DOPE (Boehringer). Other cationic
liposomes can be prepared from readily available materials using techniques well
known in the art. See, e.g., Szoka et al., Proc. Natl. Acad. Sci. USA (1978) 75:4194-
4198; PCT Publication No. WO 90/11092 for a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

Similarly, anionic and neutral liposomes are readily available, such as, from
Avanti Polar Lipids (Birmingham, AL), or can be easily prepared using readily
available materials. Such materials include phosphatidyl choline, cholesterol,
phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC),
dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE),
among others. These materials can also be mixed with the DOTMA and DOTAP
starting materials in appropriate ratios. Methods for making liposomes using these
materials are well known in the art.

The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar
vesicles (SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic
acid complexes are prepared using methods known in the art. See, e.g., Straubinger et
al., in Methods of Enzymology (1983), Vol. 101, pp. 512-527; Szoka et al.,
Proc. Natl. Acad. Sci. USA (1978) 75:4194-4198; Papahadjopoulos et al., Biochim.
Biophys. Acta (1975) 394:483; Wilson et al., Cell (1979) 17:77; Deamer and
Bangham, Biochim. Biophys. Acta (1976) 443:629; Ostro et al., Biochem. Biophys.
Res. Commun. (1977) 76:836; Fraley et al., Proc. Natl. Acad. Sci. USA (1979)
76:3348; Enoch and Strittmatter, Proc. Natl. Acad. Sci. USA (1979) 76:145; Fraley
et al., J. Biol. Chem. (1980) 255:10431; Szoka and Papahadjopoulos, Proc. Natl.
Acad. Sci. USA (1978) 75:145; and Schaefer-Ridder et al., Science (1982) 215:166.

The DNA and/or protein antigen(s) can also be delivered in cochleate lipid
compositions similar to those described by Papahadjopoulos et al., Biochim. Biophys.
Acta (1975) 394:483-491. See, also, U.S. Patent Nos. 4,663,161 and 4,871,488.

The synthetic expression cassette of interest may also be encapsulated,
adsorbed to, or associated with, particulate carriers. Such carriers present multiple
copies of a selected antigen to the immune system and promote trapping and retention
of antigens in local lymph nodes. The particles can be phagocytosed by macrophages
and can enhance antigen presentation through cytokine release. Examples of
particulate carriers include those derived from polymethyl methacrylate polymers, as
well as microparticles derived from poly(lactides) and poly(lactide-co-glycolides),
known as PLG. See, e.g., Jeffery et al., Pharm. Res. (1993) 10:362-368; McGee JP,
et al., J. Microencapsul. 14(2):197-210, 1997; O'Hagan DT, et al., Vaccine 11(2):149-
54, 1993. Suitable microparticles may also be manufactured in the presence of
charged detergents, such as anionic or cationic detergents, to yield microparticles with
a surface having a net negative or a net positive charge. For example, microparticles
manufactured with cationic detergents, such as hexadecyltrimethylammonium bromide
(CTAB), i.e. CTAB-PLG microparticles, adsorb negatively charged macromolecules,
such as DNA (see, e.g., Int'l Application Number PCT/US99/17308).

Furthermore, other particulate systems and polymers can be used for the in
vivo or ex vivo delivery of the gene of interest. For example, polymers such as
polylysine, polyarginine, polyornithine, spermine, spermidine, as well as conjugates of
these molecules, are useful for transferring a nucleic acid of interest. Similarly, DEAE
dextran-mediated transfection, calcium phosphate precipitation or precipitation using
other insoluble inorganic salts, such as strontium phosphate, aluminum silicates
including bentonite and kaolin, chromic oxide, magnesium silicate, talc, and the like,
will find use with the present methods. See, e.g., Felgner, P.L., Advanced Drug
Delivery Reviews (1990) 5:163-187, for a review of delivery systems useful for gene
transfer. Peptoids (Zuckerman, R.N., et al., U.S. Patent No. 5,831,005, issued
November 3, 1998) may also be used for delivery of a construct of the present
invention.

Additionally, biolistic delivery systems employing particulate carriers, such as
gold and tungsten, are especially useful for delivering synthetic expression cassettes of
the present invention. The particles are coated with the synthetic expression
cassette(s) to be delivered and accelerated to high velocity, generally under a reduced
atmosphere, using a gun powder discharge from a "gene gun." For a description of
such techniques, and apparatuses useful therefor, see, e.g., U.S. Patent Nos.
4,945,050; 5,036,006; 5,100,792; 5,179,022; 5,371,015; and 5,478,744. Also, needle-less
injection systems can be used (Davis, H.L., et al., Vaccine 12:1503-1509, 1994;
Bioject, Inc., Portland, OR).

Recombinant vectors carrying a synthetic expression cassette of the present
invention are formulated into compositions for delivery to the vertebrate subject.
These compositions may either be prophylactic (to prevent infection) or therapeutic (to
treat disease after infection). The compositions will comprise a "therapeutically
effective amount" of the gene of interest such that an amount of the antigen can be
produced in vivo so that an immune response is generated in the individual to which it
is administered. The exact amount necessary will vary depending on the subject being
treated; the age and general condition of the subject to be treated; the capacity of the
subject's immune system to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular antigen selected and its mode of
administration, among other factors. An appropriate effective amount can be readily
determined by one of skill in the art. Thus, a "therapeutically effective amount" will
fall in a relatively broad range that can be determined through routine trials.

The compositions will generally include one or more "pharmaceutically
acceptable excipients or vehicles" such as water, saline, glycerol, polyethyleneglycol,
hyaluronic acid, ethanol, etc. Additionally, auxiliary substances, such as wetting or
emulsifying agents, pH buffering substances, and the like, may be present in such
vehicles. Certain facilitators of nucleic acid uptake and/or expression can also be
included in the compositions or coadministered, such as, but not limited to,
bupivacaine, cardiotoxin and sucrose.

Once formulated, the compositions of the invention can be administered
directly to the subject (e.g., as described above) or, alternatively, delivered ex vivo, to
cells derived from the subject, using methods such as those described above. For
example, methods for the ex vivo delivery and reimplantation of transformed cells into
a subject are known in the art and can include, e.g., dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated transfection, lipofectamine and
LT-1 mediated transfection, protoplast fusion, electroporation, encapsulation of the
polynucleotide(s) (with or without the corresponding antigen) in liposomes, and direct
microinjection of the DNA into nuclei.

Direct delivery of synthetic expression cassette compositions in vivo will
generally be accomplished with or without viral vectors, as described above, by
injection using either a conventional syringe or a gene gun, such as the Accell® gene
delivery system (PowderJect Technologies, Inc., Oxford, England). The constructs
can be injected either subcutaneously, epidermally, intradermally, intramucosally (such
as nasally, rectally and vaginally), intraperitoneally, intravenously, orally or
intramuscularly. Delivery of DNA into cells of the epidermis is particularly preferred
as this mode of administration provides access to skin-associated lymphoid cells and
provides for a transient presence of DNA in the recipient. Other modes of
administration include oral and pulmonary administration, suppositories, needle-less
injection, transcutaneous and transdermal applications. Dosage treatment may be a
single dose schedule or a multiple dose schedule. Administration of nucleic acids may
also be combined with administration of peptides or other substances.

Exemplary immunogenicity studies are presented in Examples 4, 5, 6, 9, 10,
11, and 12.

2.4.2 Ex Vivo Delivery of the Synthetic Expression Cassettes of the Present Invention
In one embodiment, T cells, and related cell types (including but not limited to
antigen presenting cells, such as macrophages, monocytes, lymphoid cells, dendritic
cells, B-cells, T-cells, stem cells, and progenitor cells thereof), can be used for ex vivo
delivery of the synthetic expression cassettes of the present invention. T cells can be
isolated from peripheral blood lymphocytes (PBLs) by a variety of procedures known
to those skilled in the art. For example, T cell populations can be "enriched" from a
population of PBLs through the removal of accessory and B cells. In particular, T cell
enrichment can be accomplished by the elimination of non-T cells using anti-MHC
class II monoclonal antibodies. Similarly, other antibodies can be used to deplete
specific populations of non-T cells. For example, anti-Ig antibody molecules can be
used to deplete B cells and anti-MacI antibody molecules can be used to deplete
macrophages.

T cells can be further fractionated into a number of different subpopulations by
techniques known to those skilled in the art. Two major subpopulations can be
isolated based on their differential expression of the cell surface markers CD4 and
CD8. For example, following the enrichment of T cells as described above, CD4+ cells
can be enriched using antibodies specific for CD4 (see Coligan et al., supra). The
antibodies may be coupled to a solid support such as magnetic beads. Conversely,
CD8+ cells can be enriched through the use of antibodies specific for CD4 (to remove
CD4+ cells), or can be isolated by the use of CD8 antibodies coupled to a solid
support. CD4 lymphocytes from HIV-1 infected patients can be expanded ex vivo,
before or after transduction, as described by Wilson et al. (1995) J. Infect. Dis.
172:88.

Following purification of T cells, a variety of methods of genetic modification
known to those skilled in the art can be performed using non-viral or viral-based gene
transfer vectors constructed as described herein. For example, one such approach
involves transduction of the purified T cell population with vector-containing
supernatant of cultures derived from vector producing cells. A second approach
involves co-cultivation of an irradiated monolayer of vector-producing cells with the
purified T cells. A third approach involves a similar co-cultivation approach; however,
the purified T cells are pre-stimulated with various cytokines and cultured 48 hours
prior to the co-cultivation with the irradiated vector producing cells. Pre-stimulation
prior to such transduction increases effective gene transfer (Nolta et al. (1992) Exp.
Hematol. 20:1065). Stimulation of these cultures to proliferate also provides
increased cell populations for re-infusion into the patient. Subsequent to
co-cultivation, T cells are collected from the vector producing cell monolayer, expanded,
and frozen in liquid nitrogen.

Gene transfer vectors containing one or more synthetic expression cassettes of
the present invention (associated with appropriate control elements for delivery to the
isolated T cells) can be assembled using known methods and following the guidance of
the present specification.

Selectable markers can also be used in the construction of gene transfer
vectors. For example, a marker can be used which imparts to a mammalian cell
transduced with the gene transfer vector resistance to a cytotoxic agent. The cytotoxic
agent can be, but is not limited to, neomycin, aminoglycoside, tetracycline,
chloramphenicol, sulfonamide, actinomycin, netropsin, distamycin A, anthracycline, or
pyrazinamide. For example, neomycin phosphotransferase II imparts resistance to the
neomycin analogue geneticin (G418).

The T cells can also be maintained in a medium containing at least one type of
growth factor prior to being selected. A variety of growth factors are known in the art
which sustain the growth of a particular cell type. Examples of such growth factors
are cytokine mitogens such as rIL-2, IL-10, IL-12, and IL-15, which promote growth
and activation of lymphocytes. Certain types of cells are stimulated by other growth
factors such as hormones, including human chorionic gonadotropin (hCG) and human
growth hormone. The selection of an appropriate growth factor for a particular cell
population is readily accomplished by one of skill in the art.

For example, white blood cells such as differentiated progenitor and stem cells
are stimulated by a variety of growth factors. More particularly, IL-3, IL-4, IL-5, IL-6,
IL-9, GM-CSF, M-CSF, and G-CSF, produced by activated Th cells and activated
macrophages, stimulate myeloid stem cells, which then differentiate into pluripotent
stem cells, granulocyte-monocyte progenitors, eosinophil progenitors, basophil
progenitors, megakaryocytes, and erythroid progenitors. Differentiation is modulated
by growth factors such as GM-CSF, IL-3, IL-6, IL-11, and EPO.

Pluripotent stem cells then differentiate into lymphoid stem cells, bone marrow
stromal cells, T cell progenitors, B cell progenitors, thymocytes, Th cells, Tc cells, and
B cells. This differentiation is modulated by growth factors such as IL-3, IL-4, IL-6,
IL-7, GM-CSF, M-CSF, G-CSF, IL-2, and IL-5.

Granulocyte-monocyte progenitors differentiate to monocytes, macrophages,
and neutrophils. Such differentiation is modulated by the growth factors GM-CSF,
M-CSF, and IL-8. Eosinophil progenitors differentiate into eosinophils. This process is
modulated by GM-CSF and IL-5.

The differentiation of basophil progenitors into mast cells and basophils is
modulated by GM-CSF, IL-4, and IL-9. Megakaryocytes produce platelets in
response to GM-CSF, EPO, and IL-6. Erythroid progenitor cells differentiate into red
blood cells in response to EPO.
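
The progenitor-to-product relationships stated in the preceding paragraphs can be restated as a small lookup table. The following Python sketch is illustrative only: the dictionary layout and the helper names `factors_for` and `progenitors_responding_to` are assumptions introduced here, and the entries simply transcribe the growth factors named in the text for the granulocyte-monocyte, eosinophil, basophil, megakaryocyte, and erythroid lineages.

```python
# Illustrative lookup table; the data layout is an assumption, the entries
# transcribe the lineage/growth-factor statements made in the text above.
DIFFERENTIATION = {
    "granulocyte-monocyte progenitor": {
        "products": ["monocytes", "macrophages", "neutrophils"],
        "factors": ["GM-CSF", "M-CSF", "IL-8"],
    },
    "eosinophil progenitor": {
        "products": ["eosinophils"],
        "factors": ["GM-CSF", "IL-5"],
    },
    "basophil progenitor": {
        "products": ["mast cells", "basophils"],
        "factors": ["GM-CSF", "IL-4", "IL-9"],
    },
    "megakaryocyte": {
        "products": ["platelets"],
        "factors": ["GM-CSF", "EPO", "IL-6"],
    },
    "erythroid progenitor": {
        "products": ["red blood cells"],
        "factors": ["EPO"],
    },
}


def factors_for(product):
    """Return the sorted factors the text associates with a cell product."""
    found = set()
    for entry in DIFFERENTIATION.values():
        if product in entry["products"]:
            found.update(entry["factors"])
    return sorted(found)


def progenitors_responding_to(factor):
    """Return the sorted progenitor types whose differentiation lists a factor."""
    return sorted(
        name for name, entry in DIFFERENTIATION.items()
        if factor in entry["factors"]
    )
```

For instance, `factors_for("platelets")` collects the three factors named for megakaryocytes, and `progenitors_responding_to("EPO")` lists the two lineages for which EPO is mentioned.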

Thus, during activation by the CD3-binding agent, T cells can also be
contacted with a mitogen, for example a cytokine such as IL-2. In particularly
preferred embodiments, the IL-2 is added to the population of T cells at a
concentration of about 50 to 100 µg/ml. Activation with the CD3-binding agent can
be carried out for 2 to 4 days.

Once suitably activated, the T cells are genetically modified by contacting the
same with a suitable gene transfer vector under conditions that allow for transfection
of the vectors into the T cells. Genetic modification is carried out when the cell
density of the T cell population is between about 0.1 x 10* and 5 x 10*, preferably 
between about 0.5 x 10* and 2 x 10*. A number of suitable viral and nonviral-based 
gene transfer vectors have been described for use herein. 

After transduction, transduced cells are selected away from non-transduced
cells using known techniques. For example, if the gene transfer vector used in the
transduction includes a selectable marker which confers resistance to a cytotoxic
agent, the cells can be contacted with the appropriate cytotoxic agent, whereby
non-transduced cells can be negatively selected away from the transduced cells. If the
selectable marker is a cell surface marker, the cells can be contacted with a binding
agent specific for the particular cell surface marker, whereby the transduced cells can
be positively selected away from the population. The selection step can also entail
fluorescence-activated cell sorting (FACS) techniques, such as where FACS is used to
select cells from the population containing a particular surface marker, or the selection
step can entail the use of magnetically responsive particles as retrievable supports for
target cell capture and/or background removal.

More particularly, positive selection of the transduced cells can be performed
using a FACS cell sorter (e.g., a FACSVantage™ Cell Sorter, Becton Dickinson
Immunocytometry Systems, San Jose, CA) to sort and collect transduced cells
expressing a selectable cell surface marker. Following transduction, the cells are
stained with fluorescent-labeled antibody molecules directed against the particular cell
surface marker. The amount of bound antibody on each cell can be measured by
passing droplets containing the cells through the cell sorter. By imparting an
electromagnetic charge to droplets containing the stained cells, the transduced cells
can be separated from other cells. The positively selected cells are then harvested in
sterile collection vessels. These cell sorting procedures are described in detail, for
example, in the FACSVantage™ Training Manual, with particular reference to
sections 3-11 to 3-28 and 10-1 to 10-17.

Positive selection of the transduced cells can also be performed using magnetic
separation of cells based on expression of a particular cell surface marker. In such
separation techniques, cells to be positively selected are first contacted with a specific
binding agent (e.g., an antibody or reagent that interacts specifically with the cell
surface marker). The cells are then contacted with retrievable particles (e.g.,
magnetically responsive particles) which are coupled with a reagent that binds the
specific binding agent (that has bound to the positive cells). The cell-binding
agent-particle complex can then be physically separated from non-labeled cells, for example
using a magnetic field. When using magnetically responsive particles, the labeled cells
can be retained in a container using a magnetic field while the negative cells are
removed. These and similar separation procedures are known to those of ordinary skill
in the art.

Expression of the vector in the selected transduced cells can be assessed by a
number of assays known to those skilled in the art. For example, Western blot or
Northern analysis can be employed depending on the nature of the inserted nucleotide
sequence of interest. Once expression has been established and the transformed T cells
have been tested for the presence of the selected synthetic expression cassette, they are
ready for infusion into a patient via the peripheral blood stream.

The invention includes a kit for genetic modification of an ex vivo population of
primary mammalian cells. The kit typically contains a gene transfer vector coding for
at least one selectable marker and at least one synthetic expression cassette contained
in one or more containers, ancillary reagents or hardware, and instructions for use of
the kit.

2.4.3 Further Delivery Regimes

Any of the polynucleotides (e.g., expression cassettes) or polypeptides
described herein (delivered by any of the methods described above) can also be used in
combination with other DNA delivery systems and/or protein delivery systems.
Non-limiting examples include co-administration of these molecules, for example, in
prime-boost methods where one or more molecules are delivered in a "priming" step and,
subsequently, one or more molecules are delivered in a "boosting" step. In certain
embodiments, the delivery of one or more nucleic acid-containing compositions is
followed by delivery of one or more nucleic acid-containing compositions and/or one
or more polypeptide-containing compositions (e.g., polypeptides comprising HIV
antigens). In other embodiments, multiple nucleic acid "primes" (of the same or
different nucleic acid molecules) can be followed by multiple polypeptide "boosts" (of
the same or different polypeptides). Other examples include multiple nucleic acid
administrations and multiple polypeptide administrations.

In any method involving co-administration, the various compositions can be
delivered in any order. Thus, in embodiments including delivery of multiple different
compositions or molecules, the nucleic acids need not be all delivered before the
polypeptides. For example, the priming step may include delivery of one or more
polypeptides and the boosting step may comprise delivery of one or more nucleic acids
and/or one or more polypeptides. Multiple polypeptide administrations can be followed by
multiple nucleic acid administrations, or polypeptide and nucleic acid administrations
can be performed in any order. In any of the embodiments described herein, the
nucleic acid molecules can encode all, some or none of the polypeptides. Thus, one or
more of the nucleic acid molecules (e.g., expression cassettes) described herein and/or
one or more of the polypeptides described herein can be co-administered in any order
and via any administration routes. Therefore, any combination of polynucleotides
and/or polypeptides described herein can be used to elicit an immune
reaction.
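
The ordering rules in this section (a priming step followed by a boosting step, with nucleic acids and polypeptides permitted in either step) can be expressed as a short data-structure sketch. The Python below is a hypothetical illustration, not part of the specification: the `Dose` class, its field names, and the `is_valid_regimen` check are assumptions chosen for clarity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Dose:
    step: str     # "prime" or "boost" (hypothetical labels)
    kind: str     # "nucleic acid" or "polypeptide"
    antigen: str  # free-text antigen name, e.g. "Env"


def is_valid_regimen(regimen: List[Dose]) -> bool:
    """Check that a regimen has at least one prime and one boost, and that
    every prime precedes every boost. The molecule kind is deliberately
    unconstrained, since the text allows either kind in either step."""
    steps = [dose.step for dose in regimen]
    if any(step not in ("prime", "boost") for step in steps):
        return False
    if "prime" not in steps or "boost" not in steps:
        return False
    last_prime = max(i for i, s in enumerate(steps) if s == "prime")
    first_boost = min(i for i, s in enumerate(steps) if s == "boost")
    return last_prime < first_boost


# A nucleic acid prime followed by a polypeptide boost, and the reverse order
# of molecule kinds, are both acceptable under the rules sketched here.
dna_then_protein = [
    Dose("prime", "nucleic acid", "Env"),
    Dose("boost", "polypeptide", "Env"),
]
protein_then_dna = [
    Dose("prime", "polypeptide", "Gag"),
    Dose("boost", "nucleic acid", "Gag"),
]
```

The check constrains only the prime-then-boost sequence; it says nothing about dose, timing, or route, which the text leaves open.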

3.0 Improved HIV-1 Gag and Pol Expression Cassettes
While not desiring to be bound by any particular model, theory, or hypothesis,
the following information is presented to provide a more complete understanding of
the present invention.

The World Health Organization (WHO) estimated the number of people
worldwide who are infected with HIV-1 to exceed 36.1 million. The development of a
safe and effective HIV vaccine is therefore essential at this time. Recent studies have
demonstrated the importance of CTL in controlling HIV-1 replication in infected
patients. Furthermore, CTL reactivity with multiple HIV antigens will be necessary for
the effective control of virus replication. Experiments performed in support of the
present invention suggest that the inclusion of HIV-1 Gag and Pol in the vaccine, in
addition to Env for the induction of neutralizing antibodies, is useful.

To increase the potency of HIV-1 vaccine candidates, codon modified Gag and
Pol expression cassettes were designed, either for Gag alone or Gag plus Pol. To
evaluate possible differences in expression and potency, the expression of these
constructs was analyzed and immunogenicity studies carried out in mice.

Several expression cassettes encoding Gag and Pol were designed, including,
but not limited to, the following: GagProtease, GagPolΔintegrase with frameshift
(gagFSpol), and GagPolΔintegrase in-frame (gagpol). Versions of GagPolΔintegrase
in-frame were also designed with attenuated (Att) or non-functional Protease (Ina).
The nucleic acid sequences were codon modified to correspond to the codon usage of
highly expressed human genes. Mice were immunized with titrated DNA doses and
humoral and cellular immune responses evaluated by ELISA and intracellular cytokine
staining (Example 10).

The immune responses in mice have been seen to be correlated with relative
levels of expression in vitro. Vaccine studies in rhesus monkeys will further address
immune responses and expression levels in vivo.

4.0 Enhanced Vaccine Technologies for the Induction of
Potent Neutralizing Antibodies and Cellular Immune
Responses Against HIV

While not desiring to be bound by any particular model, theory, or hypothesis,
the following information is presented to provide a more complete understanding of
the present invention.

Protection against HIV infection will likely require potent and broadly reactive
pre-existing neutralizing antibodies in vaccinated individuals exposed to a virus
challenge. Although cellular immune responses are desirable to control viremia in
those who get infected, protection against infection has not been demonstrated for
vaccine approaches that rely exclusively on the induction of these responses. For this
reason, experiments performed in support of the present invention use prime-boost
approaches that employ novel V-deleted envelope antigens from primary HIV isolates
(e.g., R5 subtype B (HIV-1SF162) and subtype C (HIV-1TV1) strains). These antigens
were delivered by enhanced DNA [polylactide co-glycolide (PLG) microparticle
formulations or electroporation] or alphavirus replicon particle-based vaccine
approaches, followed by booster immunizations with Env proteins in MF59 adjuvant.
Efficient in vivo expression of plasmid encoded genes by electrical permeabilization
has been described (see, e.g., Zucchelli et al. (2000) J. Virol. 74:11598-11607; Banga
et al. (1998) Trends Biotechnol. 10:408-412; Heller et al. (1996) Febs Lett. 389:225-228;
Mathiesen et al. (1999) Gene Ther. 4:508-514; Mir et al. (1999) Proc. Natl. Acad.
Sci. USA 8:4262-4267; Nishi et al. (1996) Cancer Res. 5:1050-1055). Both native
and V-deleted monomeric (gp120) and oligomeric (o-gp140) forms of protein from the
SF162 strain were tested as boosters. All protein preparations were highly purified
and extensively characterized by biophysical and immunochemical methodologies.
Results from rabbit and primate immunogenicity studies indicated that, whereas
neutralizing antibody responses could be consistently induced against the parental
non-V2-deleted SF162 virus, the induction of responses against heterologous HIV strains
improved with deletion of the V2 loop of the immunogens. Moreover, using these
prime-boost vaccine regimens, potent HIV antigen-specific CD4+ and CD8+ T-cell
responses were also demonstrated.

Based on these findings, V2-deleted envelope DNA and protein vaccines were
chosen for advancement toward clinical evaluation. Similar approaches for
immunization may be employed using, for example, nucleic acid immunization
employing the synthetic HIV polynucleotides of the present invention coupled with
corresponding or heterologous HIV-derived polypeptide boosts.

One embodiment of this aspect of the present invention may be described
generally as follows. Antigens are selected for the vaccine composition(s). Env
polypeptides are typically employed in a first antigenic composition used to induce an
immune response. Further, Gag polypeptides are typically employed in a second
antigenic composition used to induce an immune response. The second antigenic
composition may include further HIV-derived polypeptide sequences, including, but
not limited to, Pol, Tat, Rev, Nef, Vif, Vpr, and/or Vpu sequences. A DNA prime
vaccination is typically performed with the first and second antigenic compositions.
Further DNA vaccinations with one or more of the antigenic compositions may also be
included at selected time intervals. The prime is typically followed by at least one
boost. The boost may, for example, include adjuvanted HIV-derived polypeptides
(e.g., corresponding to those used for the DNA vaccinations), coding sequences for
HIV-derived polypeptides (e.g., corresponding to those used for the DNA
vaccinations) encoded by a viral vector, further DNA vaccinations, and/or
combinations of the foregoing. In one embodiment, a DNA prime is administered with
a first antigenic composition (e.g., a DNA construct encoding an Envelope
polypeptide) and second antigenic composition (e.g., a DNA construct encoding a Gag
polypeptide, a Pol polypeptide, a Tat polypeptide, a Nef polypeptide, and a Rev
polypeptide). The DNA construct for use in the prime may, for example, comprise a
CMV promoter operably linked to the polynucleotide encoding the polypeptide
sequence. The DNA prime is followed by a boost, for example, an adjuvanted
Envelope polypeptide boost and a viral vector boost (where the viral vector encodes,
e.g., a Gag polypeptide, a Pol polypeptide, a Tat polypeptide, a Nef polypeptide, and a
Rev polypeptide). Alternately (or in addition), the boost may be an adjuvanted Gag
polypeptide, Pol polypeptide, Tat polypeptide, Nef polypeptide, and Rev polypeptide
boost and a viral vector boost (where the viral vector encodes, e.g., an Envelope
polypeptide). The boost may include all polypeptide antigens which were encoded in
the DNA prime; however, this is not required. Further, different polypeptide antigens
may be used in the boost relative to the initial vaccination and vice versa. Further, the
initial vaccination may be a viral vector rather than a DNA construct.

Some factors that may be considered in HIV envelope vaccine design are as
follows. Envelope-based vaccines have demonstrated protection against infection in
non-human primate models. Passive antibody studies have demonstrated protection
against HIV infection in the presence of neutralizing antibodies against the virus
challenge stock. Vaccines that exclude Env generally confer less protective efficacy.
Experiments performed in support of the present invention have demonstrated that
monomeric gp120 protein derived from the SF2 lab strain provided neutralization of
HIV-1 lab strains and protection against virus challenges in primate models. Primary
gp120 protein derived from Thai E field strains provided cross-subtype neutralization
of lab strains. Primary sub-type B oligomeric o-gp140 protein provided partial
neutralization of subtype B primary (field) isolates. Primary sub-type B o-gp140ΔV2
DNA prime plus protein boost provided potent neutralization of diverse subtype B
primary isolates and protection against virus challenge in primate models. Primary
sub-type C o-gp140 and o-gp140ΔV2 likely provide similar results to those just
described for sub-type B.

Vaccine strategies for induction of potent, broadly reactive, neutralizing
antibodies may be assisted by construction of Envelope polypeptide structures that
expose conserved neutralizing epitopes, for example, variable-region deletions and
de-glycosylations, envelope protein-receptor complexes, rational design based on crystal
structure (e.g., β-sheet deletions), and gp41-fusion domain based immunogens.

Stable CHO cell lines for envelope protein production have been developed
using optimized envelope polypeptide coding sequences, including, but not limited to,
the following: gp120, o-gp140, gp120ΔV2, o-gp140ΔV2, gp120ΔV1V2, and
o-gp140ΔV1V2.

In addition, prime-boost regimes (such as those described above)
appear to be beneficial in helping to reduce viral load in infected subjects, as well as
possibly slowing or preventing progression of HIV-related disease (relative to untreated
subjects).

Exemplary antigenic compositions and immunogenicity studies are presented in
Examples 9, 10, 11, and 12.

Experimental

Below are examples of specific embodiments for carrying out the present
invention. The examples are offered for illustrative purposes only, and are not
intended to limit the scope of the present invention in any way.

Efforts have been made to ensure accuracy with respect to numbers used (e.g.,
amounts, temperatures, etc.), but some experimental error and deviation should, of
course, be allowed for.

Example 1

Generation of Synthetic Expression Cassettes
A. Generating Synthetic Polynucleotides
The polynucleotide sequences of the present invention were manipulated to
maximize expression of their gene products. The order of the following steps may
vary.

First, the HIV-1 codon usage pattern was modified so that the resulting nucleic
acid coding sequence was comparable to codon usage found in highly expressed
human genes. The HIV codon usage reflects a high content of the nucleotides A or T
in the codon triplet. The effect of the HIV-1 codon usage is a high AT content in the
DNA sequence, which results in a high AU content in the RNA and, in turn, in decreased
translation ability and instability of the mRNA. In comparison, highly expressed
human codons prefer the nucleotides G or C. The wild-type sequences were modified
to be comparable to codon usage found in highly expressed human genes.
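
The codon modification step described above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the procedure actually used to generate the disclosed sequences: the `PREFERRED` table (one G/C-rich codon per amino acid) is a hypothetical stand-in for a codon usage table of highly expressed human genes, and the 18-nucleotide input is a toy sequence rather than an HIV sequence. The sketch translates the input with the standard genetic code, re-encodes each amino acid with the assumed preferred codon, and compares GC content before and after.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# ASSUMPTION: one G/C-rich codon per amino acid, standing in for the codon
# usage of highly expressed human genes. Not taken from the specification.
PREFERRED = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna):
    """Translate a DNA coding sequence (length a multiple of 3) to protein."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_modify(dna):
    """Re-encode the same protein using the assumed preferred codons."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


def gc_fraction(dna):
    """Fraction of G and C nucleotides in the sequence."""
    return sum(base in "GC" for base in dna) / len(dna)


toy_wild_type = "ATGGGTGCAAGAGCATCA"   # hypothetical A/T-rich coding fragment
toy_modified = codon_modify(toy_wild_type)
```

The encoded protein is unchanged by construction, while the GC fraction of the toy fragment rises; a realistic design would also weigh factors such as restriction sites and repeated motifs, which this sketch ignores.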

Second, for some genes non-functional variants were created. In the following
table (Table B) mutations affecting the activity of several HIV genes are disclosed.

Table B

Gene  "Region"   Exemplary Mutations
----  ---------  ----------------------------------------------------------------
Pol   prot       Att = Reduced activity by attenuation of Protease (Thr26Ser)
                 (e.g., Konvalinka et al., 1995, J Virol 69:7180-86)
                 Ina = Mutated Protease, nonfunctional enzyme (Asp25Ala)
                 (e.g., Konvalinka et al., 1995, J Virol 69:7180-86)

      RT         YM = Deletion of catalytic center (YMDD^AP; SEQ ID NO:7)
                 (e.g., Biochemistry, 1995, 34, 5351, Patel et al.)
                 WM = Deletion of primer grip region (WMGY^PI; SEQ ID NO:8)
                 (e.g., J Biol Chem, 272, 17, 11157, Palaniappan et al., 1997)

      RNase      No direct mutations; RNaseH is affected by the "WM" mutation
                 in RT

      Integrase  1) Mutation of HHCC domain, Cys40Ala
                 (e.g., Wiskerchen et al., 1995, J Virol 69:376)
                 2) Inactivation of catalytic center, Asp64Ala, Asp116Ala,
                 Glu152Ala (e.g., Wiskerchen et al., 1995, J Virol 69:376)
                 3) Inactivation of minimal DNA binding domain (MDBD),
                 deletion of Trp235 (e.g., Ishikawa et al., 1999, J Virol
                 73:4475)
                 Constructs int.opt.mut.SF2 and int.opt.mut_C (South Africa
                 TV1) both contain all these mutations (1, 2, and 3)

Env              Mutations in cleavage site (e.g., mut1-4, 7)
                 Mutations in glycosylation site (e.g., GM mutants, for
                 example, change Q residue in V1 and/or V2 to N residue; may
                 also be designated by residue altered in sequence)

Tat              Mutants of Tat in transactivation domain (e.g., Caputo et
                 al., 1996, Gene Ther. 3:235)
                 cys22 mutant (Cys22Gly) = TatC22
                 cys37 mutant (Cys37Ser) = TatC37
                 cys22/37 double mutant = TatC22/37

Rev              Mutations in Rev domains (e.g., Thomas et al., 1998, J Virol
                 72:2935-44)
                 Mutation in RNA binding-nuclear localization:
                 ArgArg38,39AspLeu = M5
                 Mutation in activation domain: LeuGlu78,79AspLeu = M10

Nef              Mutations of myristoylation signal and in oligomerization
                 domain:
                 1. Single point mutation myristoylation signal:
                 Gly-to-Ala = -Myr
                 2. Deletion of N-terminal first 18 (sub-type B, e.g., SF162)
                 or 19 (sub-type C, e.g., South Africa clones) amino acids:
                 -Myr18 or -Myr19 (respectively)
                 (e.g., Peng and Robert-Guroff, 2001, Immunol Letters
                 78:195-200)
                 Single point mutation oligomerization
                 (e.g., Liu et al., 2000, J Virol 74:5310-19):
                 Asp125Gly (sub B SF162) or Asp124Gly (sub C South Africa
                 clones)
                 Mutations affecting (1) infectivity (replication) of
                 HIV-virions and/or (2) CD4 down regulation (e.g., Lundquist
                 et al. (2002) J Virol 76(9):4625-33)

Vif              Mutations of Vif:
                 e.g., Simon et al., 1999, J Virol 73:2675-81

Vpr              Mutations of Vpr:
                 e.g., Singh et al., 2000, J Virol 74:10650-57

Vpu              Mutations of Vpu:
                 e.g., Tiganos et al., 1998, Virology 251:96-107

Constructs comprising some of these mutations are described herein. Vif, vpr 
and vpu synthetic constructs are described. Reducing or eliminating the function of 
the associated gene products can be accomplished employing the teachings set forth in 
the above table, in view of the teachings of the present specification.

In one embodiment of the invention, the full length coding region of the Gag-
polymerase sequence is included with the synthetic Gag sequences in order to increase
the number of epitopes for virus-like particles expressed by the synthetic, optimized
Gag expression cassette. Because synthetic HIV-1 Gag-polymerase expresses the
potentially deleterious functional enzymes reverse transcriptase (RT) and integrase
(INT) (in addition to the structural proteins and protease), it is important to inactivate
RT and INT functions. Several in-frame deletions in the RT and INT reading frame
can be made to achieve catalytically nonfunctional enzymes with respect to their RT and
INT activity. {Jay A. Levy (Editor) (1995) The Retroviridae, Plenum Press, New
York. ISBN 0-306-45033X. Pages 215-20; Grimison, B. and Laurence, J. (1995)
Journal Of Acquired Immune Deficiency Syndromes and Human Retrovirology
9(1):58-68; Wakefield, J. K., et al., (1992) Journal Of Virology 66(11):6806-6812;
Esnouf, R., et al., (1995) Nature Structural Biology 2(4):303-308; Maignan, S., et al.,
(1998) Journal Of Molecular Biology 282(2):359-368; Katz, R. A. and Skalka, A. M.
(1994) Annual Review Of Biochemistry 73 (1994); Jacobo-Molina, A., et al., (1993)
Proceedings Of the National Academy Of Sciences Of the United States Of America
90(13):6320-6324; Hickman, A. B., et al., (1994) Journal Of Biological Chemistry
269(46):29279-29287; Goldgur, Y., et al., (1998) Proceedings Of the National
Academy Of Sciences Of the United States Of America 95(16):9150-9154; Goette,
M., et al., (1998) Journal Of Biological Chemistry 273(17):10139-10146; Gorton, J.
L., et al., (1998) Journal Of Virology 72(6):5046-5055; Engelman, A., et al., (1997)
Journal Of Virology 71(5):3507-3514; Dyda, F., et al., (1994) Science 266(5193):1981-1986;
Davies, J. F., et al., (1991) Science 252(5002):88-95; Bujacz, G., et al., (1996) Febs
Letters 398(2-3):175-178; Beard, W. A., et al., (1996) Journal Of Biological
Chemistry 271(21):12213-12220; Kohlstaedt, L. A., et al., (1992) Science
256(5065):1783-1790; Krug, M. S. and Berger, S. L. (1991) Biochemistry
30(44):10614-10623; Mazumder, A., et al., (1996) Molecular Pharmacology
49(4):621-628; Palaniappan, C., et al., (1997) Journal Of Biological Chemistry
272(17):11157-11164; Rodgers, D. W., et al., (1995) Proceedings Of the National
Academy Of Sciences Of the United States Of America 92(4):1222-1226; Sheng, N.
and Dennis, D. (1993) Biochemistry 32(18):4938-4942; Spence, R. A., et al., (1995)
Science 267(5200):988-993.}

Furthermore, selected B- and/or T-cell epitopes can be added to the Gag-
polymerase constructs within the deletions of the RT- and INT-coding sequence to
replace and augment any epitopes deleted by the functional modifications of RT and
INT. Alternately, selected B- and T-cell epitopes (including CTL epitopes) from RT
and INT can be included in a minimal VLP formed by expression of the synthetic Gag
or synthetic GagProt cassette, described above. (For descriptions of known HIV B-
and T-cell epitopes see HIV Molecular Immunology Database CTL Search Interface;
Los Alamos Sequence Compendia, 1987-1997; Internet address:
http://hiv-web.lanl.gov/immunology/index.html.)

In another aspect, the present invention comprises Env coding sequences that
include, but are not limited to, polynucleotide sequences encoding the following HIV-
encoded polypeptides: gp160, gp140, and gp120 (see, e.g., U.S. Patent No. 5,792,459
for a description of the HIV-1SF2 ("SF2") Env polypeptide). The relationships between
these polypeptides are shown schematically in Figure 3 (in the figure: the polypeptides
are indicated as lines; the amino and carboxy termini are indicated on the gp160 line;
the open circle represents the oligomerization domain; the open square represents a
transmembrane spanning domain (TM); "c" represents the location of a cleavage
site; and in gp140.mut the "X" indicates that the cleavage site has been mutated such that it
no longer functions as a cleavage site). The polypeptide gp160 includes the
sequences of gp120 and gp41. The polypeptide gp41 comprises several domains,
including an oligomerization domain (OD) and a transmembrane spanning domain
(TM). In the native envelope, the oligomerization domain is required for the non-
covalent association of three gp41 polypeptides to form a trimeric structure; through
non-covalent interactions with the gp41 trimer (and itself), the gp120 polypeptides are
also organized in a trimeric structure. A cleavage site (or cleavage sites) exists
approximately between the polypeptide sequences for gp120 and the polypeptide
sequences corresponding to gp41. This cleavage site(s) can be mutated to prevent
cleavage at the site. The resulting gp140 polypeptide corresponds to a truncated form
of gp160 where the transmembrane spanning domain of gp41 has been deleted. This
gp140 polypeptide can exist in both monomeric and oligomeric (i.e., trimeric) forms by
virtue of the presence of the oligomerization domain in the gp41 moiety. In the
situation where the cleavage site has been mutated to prevent cleavage and the
transmembrane portion of gp41 has been deleted, the resulting polypeptide product is
designated "mutated" gp140 (e.g., gp140.mut). As will be apparent to those in the
field, the cleavage site can be mutated in a variety of ways. (See, also, WO 00/39302).
Wild-type HIV coding sequences (e.g., Gag, Env, Pol, tat, rev, nef, vpr, vpu,
vif, etc.) can be selected from any known HIV isolate and these sequences
manipulated to maximize expression of their gene products following the teachings of
the present invention. The wild-type coding region may be modified in one or more of
the following ways. In one embodiment, sequences encoding hypervariable regions of
Env, particularly V1 and/or V2, were deleted. In other embodiments, mutations were
introduced into sequences, for example, encoding the cleavage site in Env to abrogate
the enzymatic cleavage of oligomeric gp140
into gp120 monomers. (See, e.g., Earl et al. (1990) PNAS USA 87:648-652; Earl et al.
(1991) J. Virol. 65:31-41). In yet other embodiments, hypervariable region(s) were
deleted, N-glycosylation sites were removed and/or cleavage sites mutated. As
discussed above, different mutations may be introduced into the coding sequences of
different genes (see, e.g., Table B). For example, Tat coding sequences were modified
according to the teachings of the present specification, for example to affect the
transactivation domain of the gene product (e.g., replacing a cysteine residue at position
22 with a glycine, Caputo et al. (1996) Gene Therapy 3:235).
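The substitution nomenclature used above (e.g., Cys22Gly, Cys37Ser) can be illustrated with a short sketch. The function name, the regular expression, and the 40-residue toy peptide below are assumptions made only for illustration; the toy peptide is not the Tat sequence and merely carries cysteines at positions 22 and 37.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution written as e.g. 'Cys22Gly' (1-based position)."""
    m = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild = THREE_TO_ONE[m.group(1)]
    pos = int(m.group(2))
    new = THREE_TO_ONE[m.group(3)]
    # Refuse to mutate if the stated wild-type residue is not at that position.
    if not 1 <= pos <= len(protein) or protein[pos - 1] != wild:
        raise ValueError(f"{m.group(1)} not found at position {pos}")
    return protein[:pos - 1] + new + protein[pos:]


# Hypothetical toy peptide (NOT Tat): cysteines at positions 22 and 37 only.
TOY_PEPTIDE = "A" * 21 + "C" + "A" * 14 + "C" + "A" * 3

single_mutant = apply_substitution(TOY_PEPTIDE, "Cys22Gly")
double_mutant = apply_substitution(single_mutant, "Cys37Ser")
```

Checking the stated wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when positions are counted from different start codons.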

To create the synthetic coding sequences of the present invention, the gene
cassettes are designed to comprise the entire coding sequence of interest. Synthetic
gene cassettes are constructed by oligonucleotide synthesis and PCR amplification to
generate gene fragments. Primers are chosen to provide convenient restriction sites
for subcloning. The resulting fragments are then ligated to create the entire desired
sequence, which is then cloned into an appropriate vector. The final synthetic
sequences are (i) screened by restriction endonuclease digestion and analysis, (ii)
subjected to DNA sequencing in order to confirm that the desired sequence has been
obtained, and (iii) expressed, with the identity and integrity of the expressed protein
confirmed by SDS-PAGE and Western blotting. The synthetic coding sequences are
assembled at Chiron Corp. (Emeryville, CA) or by the Midland Certified Reagent
Company (Midland, Texas).

Percent identity to the synthetic sequences of the present invention can be
determined, for example, using the Smith-Waterman search algorithm (Time Logic,
Incline Village, NV), with the following exemplary parameters: weight matrix =
nuc4x4hb; gap opening penalty = 20, gap extension penalty = 5, reporting threshold =
1; alignment threshold = 20.
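A minimal sketch of a Smith-Waterman local alignment with affine gap penalties may help make these parameters concrete. It is not the Time Logic implementation. The match/mismatch scores (+5/-4) are placeholders standing in for the nuc4x4hb weight matrix, whose values are not given here; the gap convention (the first gapped position costs the opening penalty, each further position the extension penalty) and the definition of percent identity as matches divided by aligned columns are likewise assumptions.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    score: float   # alignment score ending at this cell
    matches: int   # identical positions on the best path
    length: int    # aligned columns (gaps included) on the best path


ZERO = Cell(0, 0, 0)
UNREACHABLE = Cell(float("-inf"), 0, 0)


def _gap(cell: Cell, penalty: int) -> Cell:
    """Extend a path by one gapped column at the given cost."""
    return Cell(cell.score - penalty, cell.matches, cell.length + 1)


def local_percent_identity(a, b, match=5, mismatch=-4, gap_open=20, gap_extend=5):
    """Return (best local score, percent identity) for sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[ZERO] * cols for _ in range(rows)]         # best path ending at (i, j)
    E = [[UNREACHABLE] * cols for _ in range(rows)]  # path ending with a gap in a
    F = [[UNREACHABLE] * cols for _ in range(rows)]  # path ending with a gap in b
    best = ZERO
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            prev = H[i - 1][j - 1]
            diag = Cell(prev.score + (match if same else mismatch),
                        prev.matches + int(same), prev.length + 1)
            E[i][j] = max(_gap(H[i][j - 1], gap_open), _gap(E[i][j - 1], gap_extend))
            F[i][j] = max(_gap(H[i - 1][j], gap_open), _gap(F[i - 1][j], gap_extend))
            cell = max(diag, E[i][j], F[i][j])
            # Local alignment: restart whenever the running score is not positive.
            H[i][j] = cell if cell.score > 0 else ZERO
            best = max(best, H[i][j])
    identity = 100.0 * best.matches / best.length if best.length else 0.0
    return best.score, identity
```

With an opening penalty of 20 against a match score of 5, a gap is only introduced when the flanking matched regions are long enough to pay for it, which is why short local alignments under these parameters tend to be ungapped.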

Various forms of the different embodiments of the present invention (e.g.,
constructs) may be combined.

Exemplary embodiments of the synthetic polynucleotides of the present
invention include, but are not limited to, the sequences presented in Table C.

Table C

Type C Synthetic, Codon Optimized Polynucleotides

Name (SEQ ID NO) | Figure Number | Description (encoding)

GagComplPolmut_C (SEQ ID NO:9) | 6 | Gag complete, Pol, RT mutated; all in-frame
GagComplPolmutAtt_C (SEQ ID NO:10) | 7 | Gag complete, Pol, RT mutated, protease attenuated; all in-frame
GagComplPolmutIna_C (SEQ ID NO:11) | 8 | Gag complete, Pol, RT mutated, protease non-functional; all in-frame
GagComplPolmutInaTatRevNef_C (SEQ ID NO:12) | 9 | Gag complete, Pol, RT mutated, protease non-functional, tat mutated, rev mutated, nef mutated; all in-frame
GagPolmut_C (SEQ ID NO:13) | 10 | Gag, Pol, RT mutated; all in-frame
GagPolmutAtt_C (SEQ ID NO:14) | 11 | Gag, Pol, RT mutated, protease attenuated; all in-frame
GagPolmutIna_C (SEQ ID NO:15) | 12 | Gag, Pol, RT mutated, protease non-functional; all in-frame
GagProtInaRTmut_C (SEQ ID NO:16) | 13 | Gag, protease non-functional, RT mutated; all in-frame
GagProtInaRTmutTatRevNef_C (SEQ ID NO:17) | 14 | Gag, protease non-functional, RT mutated, tat mutated, rev mutated, nef mutated; all in-frame
GagRTmut_C (SEQ ID NO:18) | 15 | Gag, RT mutated; all in-frame
GagRTmutTatRevNef_C (SEQ ID NO:19) | 16 | Gag, RT mutated, tat mutated, rev mutated, nef mutated; all in-frame
GagTatRevNef_C (SEQ ID NO:20) | 17 | Gag, tat mutated, rev mutated, nef mutated; all in-frame
gp120mod.TV1.del118-210 (SEQ ID NO:21) | 18 | gp120 derived from TV1.c8.2, deleted V1/V2 loops and stem
gp120mod.TV1.delV1V2 (SEQ ID NO:22) | 19 | gp120 derived from TV1.c8.2, deleted V1/V2 loops
gp120mod.TV1.delV2 (SEQ ID NO:23) | 20 | gp120 derived from TV1.c8.2, deleted V2 loop
gp140mod.TV1.del118-210 (SEQ ID NO:24) | 21 | gp140 derived from TV1.c8.2, deleted V1/V2 loops and stem
gp140mod.TV1.delV1V2 (SEQ ID NO:25) | 22 | gp140 derived from TV1.c8.2, deleted V1/V2 loops
gp140mod.TV1.delV2 (SEQ ID NO:26) | 23 | gp140 derived from TV1.c8.2, deleted V2 loop
gp140mod.TV1.mut7 (SEQ ID NO:27) | 24 | gp140 derived from TV1.c8.2, mutated protease cleavage site
gp140mod.TV1.tpa2 (SEQ ID NO:28) | 25 | gp140 derived from TV1.c8.2, tpa2 leader sequence
gp140TMmod.TV1 (SEQ ID NO:29) | 26 | gp140 derived from TV1.c8.2, containing the transmembrane region
gp160mod.TV1.del118-210 (SEQ ID NO:30) | 27 | gp160 derived from TV1.c8.2, deleted V1/V2 loops and stem
gp160mod.TV1.delV1V2 (SEQ ID NO:31) | 28 | gp160 derived from TV1.c8.2, deleted V1/V2 loops
gp160mod.TV1.delV2 (SEQ ID NO:32) | 29 | gp160 derived from TV1.c8.2, deleted V2 loop
gp160mod.TV1.dV1 (SEQ ID NO:33) | 30 | gp160 derived from TV1.c8.2, deleted V1 loop
gp160mod.TV1.dV1-gagmod.BW965 (SEQ ID NO:34) | 31 | gp160 derived from TV1.c8.2, deleted V1 loop, Gag derived from BW965; all in-frame
gp160mod.TV1.dV1V2-gagmod.BW965 (SEQ ID NO:35) | 32 | gp160 derived from TV1.c8.2, deleted V1/V2 loops, Gag derived from BW965; all in-frame
gp160mod.TV1.dV2-gagmod.BW965 (SEQ ID NO:36) | 33 | gp160 derived from TV1.c8.2, deleted V2 loop, Gag derived from BW965; all in-frame
gp160mod.TV1.tpa2 (SEQ ID NO:37) | 34 | gp160 derived from TV1.c8.2, tpa2 leader; all in-frame
gp160mod.TV1-gagmod.BW965 (SEQ ID NO:38) | 35 | gp160 derived from TV1.c8.2, Gag derived from BW965; all in-frame
int.opt.mut_C (SEQ ID NO:39) | 36 | integrase mutated
int.opt_C (SEQ ID NO:40) | 37 | integrase [illegible]
nef.D106G.-myr19.opt_C (SEQ ID NO:41) | 38 |
p15RNaseH.opt_C (SEQ ID NO:42) | 39 | p15 RNase H; all in-frame
p2Pol.opt.YMWM_C (SEQ ID NO:43) | 40 | p2 Pol, RT mutated YM WM; all in-frame
p2Pol.opt.YM_C (SEQ ID NO:44) | 41 | p2 Pol, RT mutated YM; all in-frame
p2Pol.opt_C (SEQ ID NO:45) | 42 | p2 Pol; all in-frame
p2PolTatRevNefopt_C (SEQ ID NO:46) | 43 | p2 Pol, RT mutated, protease non-functional, tat mutated, rev mutated, nef mutated; all in-frame
p2PolTatRevNef.opt.native_C (SEQ ID NO:47) | 44 | p2 Pol, tat native, rev native, nef native; all in-frame
p2PolTatRevNef.opt_C (SEQ ID NO:48) | 45 | p2 Pol, RT mutated, protease non-functional, tat mutated, rev mutated, nef mutated; all in-frame
protInaRT.YM.opt_C (SEQ ID NO:49) | 46 | Protease non-functional, RT mutated YM; all in-frame
protInaRT.YMWM.opt_C (SEQ ID NO:50) | 47 | Protease non-functional, RT mutated YM WM; all in-frame
ProtRT.TatRevNef.opt_C (SEQ ID NO:51) | 48 | RT mutated, protease non-functional, tat mutated, rev mutated, nef mutated; all in-frame
rev.exon1_2.M5-10.opt_C (SEQ ID NO:52) | 49 | rev exons 1 and 2 mutated; all in-frame
tat.exon1_2.opt.C22-37_C (SEQ ID NO:53) | 50 | tat exons 1 and 2 mutated; all in-frame
tat.exon1_2.opt.C37_C (SEQ ID NO:54) | 51 | tat exons 1 and 2 mutated; all in-frame
TatRevNef.opt.native_ZA (SEQ ID NO:55) | 52 | tat native, rev native, nef native; all in-frame
TatRevNef.opt_ZA (SEQ ID NO:56) | 53 | tat mutated, rev mutated, nef mutated; all in-frame
TatRevNefGag_C (SEQ ID NO:57) | 54 | tat mutated, rev mutated, nef mutated, Gag; all in-frame
TatRevNefgagCpolIna_C (SEQ ID NO:58) | 55 | tat mutated, rev mutated, nef mutated, Gag complete, pol, RT mutated, protease non-functional; all in-frame
TatRevNefGagProtInaRTmut_C (SEQ ID NO:59) | 56 | tat mutated, rev mutated, nef mutated, Gag, protease non-functional, RT mutated; all in-frame
TatRevNefProtRT.opt_C (SEQ ID NO:60) | 57 | tat mutated, rev mutated, nef mutated, protease non-functional, RT mutated; all in-frame
gp140modTV1.mut1.dV2 (SEQ ID NO: [illegible]) | 104 | env derived from TV1, mutated in cellular protease cleavage site between gp120/gp41 (may prevent cleavage and may facilitate protein purification); deletion in second variable region (V2)
gp140modTV1.mut2.dV2 (SEQ ID NO: [illegible]) | 105 | env derived from TV1, mutated in cellular protease cleavage site between gp120/gp41 (may prevent cleavage and may facilitate protein purification); deletion in second variable region (V2)
gp140modTV1.mut3.dV2 (SEQ ID NO: [illegible]) | 106 | env derived from TV1, mutated in cellular protease cleavage site between gp120/gp41 (may prevent cleavage and may facilitate protein purification); deletion in second variable region (V2)
gp140modTV1.mut4.dV2 (SEQ ID NO:186) | 107 | env derived from TV1, mutated in cellular protease cleavage site between gp120/gp41 (may prevent cleavage and may facilitate protein purification); deletion in second variable region (V2)
gp140modTV1.GM161 (SEQ ID NO:187) | 108 | env derived from TV1, glycosylation site mutation (GM) at amino acid position 161 of Env (N to Q substitution)
gp140modTV1.GM161-195-204 (SEQ ID NO:188) | 109 | env derived from TV1, glycosylation site mutation (GM) at amino acid positions 161, 195 and 204 of Env (N to Q substitution)
gp140modTV1.GM161-204 (SEQ ID NO:189) | 110 | env derived from TV1, glycosylation site mutation (GM) at amino acid positions 161 and 204 of Env (N to Q substitution)
gp140mod.TV1.GM-V1V2 (SEQ ID NO:190) | 111 | env derived from TV1, glycosylation site mutation (GM) at various amino acid positions (see also FIG. 114)
gp140modC8.2mut7.delV2.Kozmod.Ta (SEQ ID NO:191) | 112 | env derived from TV1, mutated in cellular protease cleavage site between gp120/gp41 (may prevent cleavage and may facilitate protein purification); deletion in second variable region (V2); 5' Kozak sequence and 3' TAAA termination sequence
Nef-myrD124LLAA (SEQ ID NO:203) | 115 | Nef with mutation in myristoylation site
gp160mod.TV2 (SEQ ID NO:205) | 117 | env derived from TV2


B. Creating Expression Cassettes Comprising the Synthetic Polynucleotides of the
Present Invention.

The synthetic DNA fragments of the present invention are cloned into the
following expression vectors: pCMVKm2, for transient expression assays and DNA
immunization studies (the pCMVKm2 vector was derived from pCMV6a (Chapman et
al., Nuc. Acids Res. (1991) 19:3979-3986) and comprises a kanamycin selectable
marker, a ColE1 origin of replication, a CMV promoter enhancer and Intron A,
followed by an insertion site for the synthetic sequences described below, followed by a
polyadenylation signal derived from bovine growth hormone; the pCMVKm2 vector
differs from the pCMV-link vector only in that a polylinker site was inserted into
pCMVKm2 to generate pCMV-link); pESN2dhfr and pCMVPLEdhfr (also known as
pCMVIII), for expression in Chinese Hamster Ovary (CHO) cells; and pAcC13, a
shuttle vector for use in the Baculovirus expression system (pAcC13 was derived
from pAcC12, which was described by Munemitsu S., et al., Mol Cell Biol.
10(11):5977-5982, 1990). See also co-owned WO 00/39303, WO 00/39302, WO
00/39304, and WO 02/04493 for a description of these vectors.

Briefly, construction of pCMVPLEdhfr (pCMVIII) was as follows. To
construct a DHFR cassette, the EMCV IRES (internal ribosome entry site) leader was
PCR-amplified from pCite-4a+ (Novagen, Inc., Milwaukee, WI) and inserted into
pET-23d (Novagen, Inc., Milwaukee, WI) as an Xba-Nco fragment to give pET-
EMCV. The dhfr gene was PCR-amplified from pESN2dhfr to give a product with a
Gly-Gly-Gly-Ser spacer in place of the translation stop codon and inserted as an Nco-
BamH1 fragment to give pET-E-DHFR. Next, the attenuated neo gene was PCR-
amplified from a pSV2Neo (Clontech, Palo Alto, CA) derivative and inserted into the
unique BamH1 site of pET-E-DHFR to give pET-E-DHFR/Neo(m2). Then, the bovine
growth hormone terminator from pCDNA3 (Invitrogen, Inc., Carlsbad, CA) was
inserted downstream of the neo gene to give pET-E-DHFR/Neo(m2)BGHt. The
EMCV-dhfr/neo selectable marker cassette fragment was prepared by cleavage of
pET-E-DHFR/Neo(m2)BGHt. The CMV enhancer/promoter plus Intron A was
transferred from pCMV6a (Chapman et al., Nuc. Acids Res. (1991) 19:3979-3986) as
a HindIII-SalI fragment into pUC19 (New England Biolabs, Inc., Beverly, MA). The
vector backbone of pUC19 was deleted from the NdeI to the SapI sites. The above
described DHFR cassette was added to the construct such that the EMCV IRES
followed the CMV promoter to produce the final construct. The vector also contained
an amp^r gene and an SV40 origin of replication.



WO 03/004620

PCT/US02/21420



Expression vectors of the present invention contain one or more of the
synthetic coding sequences disclosed herein, e.g., shown in the Figures. When the
expression cassette contains more than one coding sequence, the coding sequences may
all be in-frame to generate one polyprotein; alternately, the polypeptide
coding sequences may comprise a polycistronic message where, for example, an IRES
is placed 5' to each polypeptide coding sequence.

Example 2
Expression Assays for the
Synthetic Coding Sequences

The wild-type sequences are cloned into expression vectors having the same
features as the vectors into which the synthetic HIV-derived sequences were cloned.

Expression efficiencies for various vectors carrying the wild-type (any known
isolate) and corresponding synthetic sequence(s) are evaluated as follows. Cells from
several mammalian cell lines (293, RD, COS-7, and CHO; all obtained from the
American Type Culture Collection, 10801 University Boulevard, Manassas, VA
20110-2209) are transfected with 2 µg of DNA in transfection reagent LT1 (PanVera
Corporation, 545 Science Dr., Madison, WI). The cells are incubated for 5 hours in
reduced serum medium (Opti-MEM, Gibco-BRL, Gaithersburg, MD). The medium is
then replaced with normal medium as follows: 293 cells, IMDM, 10% fetal calf serum,
2% glutamine (BioWhittaker, Walkersville, MD); RD and COS-7 cells, D-MEM, 10%
fetal calf serum, 2% glutamine (Opti-MEM, Gibco-BRL, Gaithersburg, MD); and
CHO cells, Ham's F-12, 10% fetal calf serum, 2% glutamine (Opti-MEM, Gibco-BRL,
Gaithersburg, MD). The cells are incubated for either 48 or 60 hours. Supernatants
are harvested and filtered through 0.45 µm syringe filters and, optionally, stored at
-20°C.

Supernatants are evaluated using the Coulter p24-assay (Coulter Corporation,
Hialeah, FL, US), using 96-well plates coated with a suitable monoclonal antibody
directed against an HIV antigen (e.g., a murine monoclonal directed against an HIV core
antigen). The appropriate HIV antigen binds to the coated wells and biotinylated
antibodies against HIV recognize the bound antigen. Conjugated streptavidin-
horseradish peroxidase reacts with the biotin. Color develops from the reaction of
peroxidase with TMB substrate. The reaction is terminated by addition of 4N H2SO4.
The intensity of the color is directly proportional to the amount of HIV antigen in a
sample.

Chinese hamster ovary (CHO) cells are also transfected with plasmid DNA
encoding the synthetic HIV polypeptides described herein (e.g., pESN2dhfr or
pCMVIII vector backbone) using Mirus TransIT-LT1 polyamine transfection reagent
(PanVera) according to the manufacturer's instructions and incubated for 96 hours.
After 96 hours, media is changed to selective media (F12 special with 250 µg/ml
G418) and cells are split 1:5 and incubated for an additional 48 hours. Media is
changed every 5-7 days until colonies start forming, at which time the colonies are
picked, plated into 96 well plates and screened by Capture ELISA. Positive clones are
expanded in 24 well plates and are screened several times for HIV protein production
by Capture ELISA, as described above. After reaching confluency in 24 well plates,
positive clones are expanded to T25 flasks (Corning, Corning, NY). These are
screened several times after confluency and positive clones are expanded to T75 flasks.

Positive T75 clones are frozen in LN2 and the highest expressing clones are
amplified with 0-5 µM methotrexate (MTX) at several concentrations and plated in
100 mm culture dishes. Plates are screened for colony formation and all positive clones
are again expanded as described above. Clones are expanded and amplified and
screened at each step by capture ELISA. Positive clones are frozen at each methotrexate
level. Highest producing clones are grown in perfusion bioreactors (3L, 100L) for
expansion and adaptation to low serum suspension culture conditions for scale-up to
larger bioreactors.

Data from experiments performed in support of the present invention show that
the synthetic HIV expression cassettes provided dramatic increases in production of
their protein products, relative to the native (wild-type) sequences, when expressed in
a variety of cell lines and that stably transfected CHO cell lines, which express the
desired HIV polypeptide(s), may be produced. Production of HIV polypeptides using
CHO cells provides (i) correct glycosylation patterns and protein conformation (as
determined by binding to a panel of MAbs); (ii) correct binding to CD4 receptor
molecules; (iii) absence of non-mammalian cell contaminants (e.g., insect viruses
and/or cells); and (iv) ease of purification.

Example 3

Western Blot Analysis of Expression

Western blot analysis of cells transfected with the HIV expression cassettes
described herein is performed essentially as described in co-owned WO 00/39302.
Briefly, human 293 cells are transfected as described in Example 2 with pCMV6a-
based vectors containing native or synthetic HIV expression cassettes. Cells are
cultivated for 60 hours post-transfection. Supernatants are prepared as described.
Cell lysates are prepared as follows. The cells are washed once with phosphate-
buffered saline, lysed with detergent [1% NP40 (Sigma Chemical Co., St. Louis, MO)
in 0.1 M Tris-HCl, pH 7.5], and the lysate transferred into fresh tubes. SDS-
polyacrylamide gels (pre-cast 8-16%; Novex, San Diego, CA) are loaded with 20 µl of
supernatant or 12.5 µl of cell lysate. A protein standard is also loaded (5 µl, broad
size range standard; BioRad Laboratories, Hercules, CA). Electrophoresis is carried
out and the proteins are transferred using a BioRad Transfer Chamber (BioRad
Laboratories, Hercules, CA) to Immobilon P membranes (Millipore Corp., Bedford,
MA) using the transfer buffer recommended by the manufacturer (Millipore), where
the transfer is performed at 100 volts for 90 minutes. The membranes are exposed to
HIV-1-positive human patient serum and immunostained using o-phenylenediamine
dihydrochloride (OPD; Sigma).

The results of the immunoblotting analysis are used to show that cells
containing the synthetic HIV expression cassette produce the expected HIV
polypeptide(s) at higher per-cell concentrations than cells containing the native
expression cassette.

Example 4

In Vivo Immunogenicity of Synthetic HIV Expression Cassettes
A. Immunization

To evaluate the immunogenicity of the synthetic HIV expression cassettes, a
mouse study may be performed. The plasmid DNA, e.g., pCMVKm2 carrying an
expression cassette comprising a synthetic sequence of the present invention, is diluted
to the following final amounts in a total injection volume of 100 µl: 20 µg, 2 µg,
0.2 µg, and 0.02 µg. To overcome possible negative dilution effects of the diluted
DNA, the total DNA amount in each sample is brought up to 20 µg using the
vector (pCMVKm2) alone. As a control, plasmid DNA comprising an expression
cassette encoding the native, corresponding polypeptide is handled in the same manner.
Twelve groups of four Balb/c mice (Charles River, Boston, MA) are intramuscularly
immunized (50 µl per leg, intramuscular injection into the tibialis anterior) using
varying dosages.
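The dose series above can be tabulated with a short sketch showing how much empty carrier vector tops each sample up to 20 µg and how the 100 µl injection splits across two legs. The function and variable names are illustrative only, and splitting the DNA equally between legs is an assumption that follows from the two 50 µl injections.

```python
TOTAL_DNA_UG = 20.0               # total DNA per 100 µl injection volume
INJECTION_VOLUME_UL = 100.0       # total injection volume per animal
VOLUME_PER_LEG_UL = 50.0          # injected into each tibialis anterior
DOSES_UG = [20.0, 2.0, 0.2, 0.02] # expression-plasmid doses from the text


def carrier_vector_ug(dose_ug: float, total_ug: float = TOTAL_DNA_UG) -> float:
    """Micrograms of empty vector needed to bring a sample up to total_ug."""
    if not 0 <= dose_ug <= total_ug:
        raise ValueError("dose must lie between 0 and the total DNA amount")
    return total_ug - dose_ug


def per_leg_ug(amount_ug: float) -> float:
    """DNA delivered per leg, assuming the volume is split equally."""
    return amount_ug * VOLUME_PER_LEG_UL / INJECTION_VOLUME_UL


for dose in DOSES_UG:
    filler = carrier_vector_ug(dose)
    print(f"plasmid {dose:>6.2f} µg + vector {filler:>6.2f} µg "
          f"-> {per_leg_ug(dose):.3f} µg plasmid per leg")
```

Holding the total DNA constant means that any difference between dose groups reflects the amount of expression plasmid rather than the total DNA load.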

B. Humoral Immune Response

The humoral immune response is checked with a suitable anti-HIV antibody
ELISA (enzyme-linked immunosorbent assay) of the mice sera 0 and 4 weeks post
immunization (groups 5-12) and, in addition, 6 and 8 weeks post immunization,
respectively, 2 and 4 weeks post second immunization (groups 1-4).

The antibody titers of the sera are determined by anti-HIV antibody ELISA.
Briefly, sera from immunized mice are screened for antibodies directed against an
appropriate HIV protein (e.g., HIV p55 for Gag). ELISA microtiter plates are coated
with 0.2 µg of HIV protein per well overnight and washed four times; subsequently,
blocking is done with PBS-0.2% Tween (Sigma) for 2 hours. After removal of the
blocking solution, 100 µl of diluted mouse serum is added. Sera are tested at 1/25
dilutions and by serial 3-fold dilutions, thereafter. Microtiter plates are washed four
times and incubated with a secondary, peroxidase-coupled anti-mouse IgG antibody
(Pierce, Rockford, IL). ELISA plates are washed and 100 µl of 3,3',5,5'-tetramethyl
benzidine (TMB; Pierce) is added per well. The optical density of each well is
measured after 15 minutes. The titers reported are the reciprocal of the dilution of
serum that gave a half-maximum optical density (O.D.).
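As a rough illustration of how such a titer can be read from a dilution series, the sketch below builds the 1/25, 3-fold series and interpolates the reciprocal dilution at which the O.D. falls to half of its maximum. The log-linear interpolation between neighbouring dilutions and the use of the largest observed O.D. as the maximum are assumptions, not details given in the text; the function names are hypothetical.

```python
import math


def dilution_series(start: int = 25, fold: int = 3, n: int = 8) -> list:
    """Reciprocal dilutions: 25, 75, 225, ... for a 1/25 start and 3-fold steps."""
    return [start * fold ** k for k in range(n)]


def half_max_titer(reciprocal_dilutions, ods):
    """Reciprocal dilution at which O.D. crosses half of the maximum O.D.

    Returns None if the O.D. never falls below half-maximum within the series.
    """
    half = max(ods) / 2.0
    points = list(zip(reciprocal_dilutions, ods))
    for (d1, o1), (d2, o2) in zip(points, points[1:]):
        if o1 >= half > o2:
            # Interpolate on a log scale because dilutions are spaced 3-fold.
            frac = (o1 - half) / (o1 - o2)
            return math.exp(math.log(d1) + frac * (math.log(d2) - math.log(d1)))
    return None
```

For example, with reciprocal dilutions 25, 75, 225, 675 and illustrative O.D. readings 2.0, 1.5, 0.5, 0.1, the half-maximum (1.0) is crossed midway between 75 and 225 on the log scale.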

The results of the mouse immunizations with plasmid-DNAs are used to show
that the synthetic expression cassettes provide improvement of immunogenicity
relative to the native expression cassettes. Also, the second boost immunization
induces a secondary immune response after two weeks (groups 1-3).

C. Cellular Immune Response

The frequency of specific cytotoxic T-lymphocytes (CTL) is evaluated by a
standard chromium release assay of peptide pulsed Balb/c mouse CD4 cells. HIV
protein-expressing vaccinia virus infected CD-8 cells are used as a positive control (vv-
protein). Briefly, spleen cells (Effector cells, E) are obtained from the BALB/c mice
(immunized as described above). The cells are cultured, restimulated, and assayed for
CTL activity against, e.g., Gag peptide-pulsed target cells as described (Doe, B., and
Walker, C.M., AIDS 10(7):793-794, 1996). Cytotoxic activity is measured in a
standard 51Cr release assay. Target (T) cells are cultured with effector (E) cells at
various E:T ratios for 4 hours and the average cpm from duplicate wells is used to
calculate percent specific 51Cr release.
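The text does not spell out the calculation; a commonly used convention for percent specific release is 100 x (experimental - spontaneous) / (maximum - spontaneous), with each term taken as the mean cpm of replicate wells. The sketch below assumes that convention, and the cpm values in the usage lines are invented solely to show the arithmetic.

```python
from statistics import mean


def percent_specific_release(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Percent specific 51Cr release from replicate-well cpm readings.

    experimental_cpm: targets cultured with effector cells
    spontaneous_cpm:  targets cultured in medium alone
    maximum_cpm:      targets fully lysed (e.g., with detergent)
    """
    exp = mean(experimental_cpm)
    spont = mean(spontaneous_cpm)
    maxi = mean(maximum_cpm)
    if maxi <= spont:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (exp - spont) / (maxi - spont)


# Illustrative duplicate wells at one E:T ratio (values are made up).
release = percent_specific_release([1200, 1300], [480, 520], [2950, 3050])
```

Subtracting spontaneous release in both numerator and denominator expresses lysis as a fraction of the releasable label only, so values can be compared across E:T ratios and target preparations.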

Cytotoxic T-cell (CTL) activity is measured in splenocytes recovered from the
mice immunized with HIV DNA constructs described herein. Effector cells from the
DNA-immunized animals exhibit specific lysis of HIV peptide-pulsed SV-BALB
(MHC matched) target cells indicative of a CTL response. Target cells that are
peptide-pulsed and derived from an MHC-unmatched mouse strain (MC57) are not
lysed. The results of the CTL assays are used to show increased potency of synthetic
HIV expression cassettes for induction of cytotoxic T-lymphocyte (CTL) responses by
DNA immunization.

Example 5

In Vivo Immunogenicity of Synthetic HIV Expression Cassettes
A. General Immunization Methods
To evaluate the immunogenicity of the synthetic HIV expression cassettes,
studies using guinea pigs, rabbits, mice, rhesus macaques and baboons are performed.
The studies are typically structured as follows: DNA immunization alone (single or
multiple); DNA immunization followed by protein immunization (boost); DNA
immunization followed by Sindbis particle immunization; immunization by Sindbis
particles alone.

B. Guinea Pigs

Experiments may be performed using guinea pigs as follows. Groups
comprising six guinea pigs each are immunized intramuscularly or mucosally at 0, 4,
and 12 weeks with plasmid DNAs encoding expression cassettes comprising one or
more of the sequences described herein. The animals are subsequently boosted at
approximately 18 weeks with a single dose (intramuscularly, intradermally or mucosally)
of the HIV protein encoded by the sequence(s) of the plasmid boost and/or other HIV
proteins. Antibody titers (geometric mean titers) are measured at two weeks following
the third DNA immunization and at two weeks after the protein boost. These results
are used to demonstrate the usefulness of the synthetic constructs to generate immune
responses, as well as the advantage of providing a protein boost to enhance the
immune response following DNA immunization.



C. Rabbits

Experiments may be performed using rabbits as follows. Rabbits are
immunized intramuscularly, mucosally, or intradermally (using a Bioject needleless
syringe) with plasmid DNAs encoding the HIV proteins described herein. The nucleic
acid immunizations are followed by protein boosting after the initial immunization.
Typically, constructs comprising the synthetic HIV-polypeptide-encoding
polynucleotides of the present invention are highly immunogenic and generate
substantial antigen-binding antibody responses after only 2 immunizations in rabbits.



D. Humoral Immune Response

In any immunized animal model, the humoral immune response is checked in
serum specimens from the immunized animals with anti-HIV antibody ELISAs
(enzyme-linked immunosorbent assays) at various times post-immunization. The
antibody titers of the sera are determined by anti-HIV antibody ELISA as described
above. Briefly, sera from immunized animals are screened for antibodies directed
against the HIV polypeptide/protein(s) encoded by the DNA and/or polypeptide used
to immunize the animals. Wells of ELISA microtiter plates are coated overnight with
the selected HIV polypeptide/protein and washed four times; subsequently, blocking is
done with PBS-0.2% Tween (Sigma) for 2 hours. After removal of the blocking
solution, 100 µl of diluted mouse serum is added. Sera are tested at a 1/25 dilution and
by serial 3-fold dilutions thereafter. Microtiter plates are washed four times and
incubated with a secondary, peroxidase-coupled anti-mouse IgG antibody (Pierce,
Rockford, IL). ELISA plates are washed and 100 µl of 3,3',5,5'-tetramethylbenzidine
(TMB; Pierce) is added per well. The optical density of each well is
measured after 15 minutes. Titers are typically reported as the reciprocal of the
dilution of serum that gives a half-maximum optical density (O.D.).
Cellular immune response may also be evaluated.
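
The titer calculation described above (reciprocal of the serum dilution giving a half-maximum O.D., with sera tested at 1/25 and by serial 3-fold dilutions) can be sketched in Python as follows. This is an illustration only: the function names, the choice of log-linear interpolation between adjacent dilutions, and the example O.D. readings are assumptions and are not taken from the assays described herein.

```python
import math


def serial_dilutions(start=25.0, factor=3.0, steps=8):
    """Reciprocal dilutions for a series starting at 1/25 with 3-fold steps."""
    return [start * factor ** i for i in range(steps)]


def half_max_titer(reciprocal_dilutions, ods):
    """Return the reciprocal dilution at which the OD falls to half of the
    maximum OD, or None if the curve never crosses that level.

    Assumption: interpolation is linear in OD and logarithmic in dilution
    between the two adjacent points that bracket the half-maximum OD.
    """
    if len(reciprocal_dilutions) != len(ods) or len(ods) < 2:
        raise ValueError("need matching lists with at least two points")
    half = max(ods) / 2.0
    points = list(zip(reciprocal_dilutions, ods))
    for (d_lo, od_lo), (d_hi, od_hi) in zip(points, points[1:]):
        if od_lo >= half >= od_hi and od_lo != od_hi:
            frac = (od_lo - half) / (od_lo - od_hi)
            log_titer = math.log10(d_lo) + frac * (
                math.log10(d_hi) - math.log10(d_lo)
            )
            return 10 ** log_titer
    return None


# Hypothetical OD readings for one serum (illustrative values only).
dilutions = serial_dilutions(steps=4)
example_ods = [2.0, 1.5, 0.5, 0.1]
titer = half_max_titer(dilutions, example_ods)
```

A curve that never falls to half of its maximum within the tested range returns None, so that such sera can be retested at higher dilutions rather than being assigned an extrapolated titer.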

Example 6

DNA-immunization of Baboons and Rhesus Macaques Using Expression Cassettes
Comprising the Synthetic HIV Polynucleotides of the Present Invention
A. Baboons

Four baboons are immunized 3 times (weeks 0, 4 and 8) bilaterally,
intramuscularly into the quadriceps or mucosally, using the gene delivery vehicles
described herein. The animals are bled two weeks after each immunization and an HIV
antibody ELISA is performed with isolated plasma. The ELISA is performed
essentially as described above except the second antibody-conjugate is an anti-human
IgG, γ-chain specific, peroxidase conjugate (Sigma Chemical Co., St. Louis, MO
63178) used at a dilution of 1:500. Fifty µg/ml yeast extract may be added to the
dilutions of plasma samples and antibody conjugate to reduce non-specific background
due to preexisting yeast antibodies in the baboons. Lymphoproliferative responses
are observed in baboons two weeks post-fourth immunization (at week 14), and are
enhanced substantially post-boosting with HIV polypeptide (at weeks 44 and 76). Such
proliferation results are indicative of induction of T-helper cell functions.

B. Rhesus Macaques

The improved potency of the synthetic, codon-modified HIV-polypeptide
encoding polynucleotides of the present invention, when constructed into expression
plasmids, may be confirmed in rhesus macaques. Typically, the macaques have
detectable HIV-specific CTL after two or three 1 mg doses of modified HIV
polynucleotide. In sum, these results demonstrate that the synthetic HIV DNA is
immunogenic in non-human primates. Neutralizing antibodies may also be detected.

Example 7 

Co-Transfection of Monocistronic and Multicistronic Constructs

The present invention includes co-transfection with multiple, monocistronic
expression cassettes, as well as co-transfection with one or more multi-cistronic
expression cassettes, or combinations thereof.

Such constructs, in a variety of combinations, may be transfected into 293T
cells for transient transfection studies.

For example, a bicistronic construct may be made where the coding sequences
for the different HIV polypeptides are under the control of a single CMV promoter
and, between the two coding sequences, an IRES (internal ribosome entry site (EMCV
IRES); Kozak, M., Critical Reviews in Biochemistry and Molecular Biology
27(4-5):385-402, 1992; Witherell, G.W., et al., Virology 214:660-663, 1995) sequence
is introduced after the first HIV coding sequence and before the second HIV coding
sequence.

Supernatants collected from cell culture are tested for the presence of the HIV
proteins and indicate that appropriate proteins are expressed in the transfected cells
(e.g., if an Env coding sequence was present the corresponding Env protein was
detected; if a Gag coding sequence was present the corresponding Gag protein was
detected, etc.).

The production of chimeric VLPs by these cell lines may be determined using
electron microscopic analysis. (See, e.g., co-owned WO 00/39302.)






Example 8 

Accessory gene components for an HIV-1 vaccine: functional analysis of mutated Tat,
Rev and Nef Type C antigens
The HIV-1 regulatory and accessory genes have received increased attention as
components of HIV vaccines due to their role in viral pathogenesis, the high ratio of
highly conserved CTL epitopes and their early expression in the viral life cycle.
Because of various undesirable properties of these genes, questions regarding their
safety and suitability as vaccine components have been raised. Experiments performed
in support of the present invention have analyzed candidate HIV-1 subtype C tat, rev,
and nef mutants for efficient expression and inactivation of potential deleterious
functions. Other HIV subtype accessory genes may be evaluated similarly.

Sequence-modified, mutant tat, rev, and nef genes coding for consensus Tat,
Rev and Nef proteins of South African HIV-1 subtype C were constructed using
overlapping synthetic oligonucleotides and PCR-based site-directed mutagenesis.
Constructs of the wild-type genes of the isolates closely resembling the respective
consensus sequences were also made by PCR. In vitro expression of the constructs
was analyzed by western blotting. The trans-activation activity of the Tat mutants and
nuclear RNA export activity of the Rev mutants were studied after transfection of
various cell lines using reporter-gene-based functionality assays.

In vitro expression of all constructs was demonstrated by western blotting
using antigen-specific mouse serum generated by DNA vaccination of mice with Tat,
Rev, or Nef-expression plasmids. Expression levels of the sequence-modified genes
were significantly higher than those of the wild-type genes.

Subtype B and C Tat cDNA was mutated to get TatC22, TatC37, and
TatC22/37. Tat activity was assayed in three cell lines (RD, HeLa and 293). In the
background of the subtype C consensus Tat, a single mutation at C22 was insufficient
to inactivate LTR-dependent CAT expression. In contrast, this activity was
significantly impaired in RD, 293 and HeLa cells using the single mutation, C37, or the
double mutation, C22C37 (see Table B). Corresponding results were obtained for Tat
mutants derived from subtype B strains.

Exemplary results are presented in Figure 4 for transactivation activity of Tat
mutants on LTR-CAT plasmid in 293 cells. Three independent assays were performed
for each construct (Figure 4, legend (1), (2), (3)).

The subtype C constructs TatC22ProtRTTatRevNef and
ProtRTTatC22RevNef showed reduced Tat activity when compared to TatC22 alone,
probably due to structural changes caused by the fusion protein.

For Rev constructs, to test for the loss of function, a CAT assay with a
reporter plasmid including native or mutated Rev was used. As shown in Figure 5,
compared to wild-type Rev, the mRNA export function of the subtype C Rev with a
double mutation, M5M10 (see Table B), was significantly lower. The background
levels are shown in the "mock" data and the pDM128 reporter plasmid without Rev
data. Two independent assays were performed for each construct (Figure 5, legend
(1), (2)).

Assays to measure Nef-specific functions may also be performed (Nef
mutations are described in Table B). For example, FACS analysis is used to look for
the presence of MHC I and CD4 on cell surfaces. Cells are assayed in the presence
and absence of Nef expression (for controls), as well as using the synthetic
polynucleotides of the present invention that encode native nef protein and mutated nef
protein. Down-regulation of MHC I and CD4 expression indicates that the nef gene
product is functional, i.e., if nef is non-functional there is no down-regulation.

These data demonstrate the impaired functionality of tat and rev DNA
immunogens that may form part of a multi-component HIV-1 subtype C vaccine. In
contrast to previously published data by other groups, the C22 mutation did not
sufficiently inactivate the transactivation function of Tat. The C37 mutation appeared
to be required for inactivation of subtype C and subtype B Tat proteins.

Example 9 

Evaluation of immunogenicity of various HIV polypeptide encoding plasmids
As noted above, the immunogenicity of any of the polynucleotides or
expression cassettes described herein is readily evaluated. In the following table (Table
D) are exemplified procedures involving a comparison of the immunogenicity of
subtype B and C envelope plasmids, both individually and as a mixed-subtype vaccine,
using electroporation, in rabbits. It will be apparent that such methods are equally
applicable to any other HIV polypeptide.

Table D

The MF59C adjuvant is a microfluidized emulsion containing 5% squalene,
0.5% Tween 80, 0.5% Span 85, in 10 mM citrate pH 6, stored in 10 mL aliquots at 4°C.

Immunogens are prepared as described in the following table (Table E) for
administration to animals in the various groups. Concentrations may vary from those
described in the table, for example depending on the sequences and/or proteins being
used.

Table E

Groups 1-9: Preparation

Immunization 1-3: pCMV and pSIN based plasmid DNA in Saline +
Electroporation

Subtype B and C plasmids will be provided frozen at a concentration of 1.0 mg/ml
in sterile 0.9% saline. Store at -80°C until use. Thaw DNA at room
temperature; the material should be clear or slightly opaque, with no particulate
matter. Animals will be shaved prior to immunization, under sedation of 1x dose
IP (by animal weight) of Ketamine-Xylazine (80 mg/ml - 4 mg/ml). Immunize
each rabbit with 0.5 ml DNA mixture per side (IM/Quadriceps), 1.0 ml per
animal. Follow the DNA injection with Electroporation using a 6-needle circular
array with 1 cm diameter, 1 cm needle length. Electroporation pulses were given
at 20 V/mm, 50 ms pulse length, 1 pulse/s.

Immunization 3: Protein Immunization

Proteins will be provided at 0.1 mg/ml in citrate buffer. Store at -80°C until use.
Thaw at room temperature; material should be clear with no particulate matter.
Add equal volume of MF59C adjuvant to thawed protein and mix well by
inverting the tube. Immunize each rabbit with 0.5 ml adjuvanted protein per side,
IM/Glut for a total of 1.0 ml per animal. Use material within 1 hour of the
addition of adjuvant.

Groups 10-11: Preparation

Immunization 1-3: Combined subtype B and C plasmid DNA in Saline

The immunogen will be provided at 2.0 mg/ml total DNA (1 mg/ml of each
plasmid) in sterile 0.9% saline. Store at -80°C until use. Thaw DNA at room
temperature; the material should be clear or slightly opaque, with no particulate
matter. Animals will be shaved prior to immunization, under sedation of 1x dose
IP (by animal weight) of Ketamine-Xylazine (80 mg/ml - 4 mg/ml). Immunize each
rabbit with 0.5 ml DNA mixture per side (IM/Quadriceps), 1.0 ml per animal.
Follow the DNA injection with Electroporation using a 6-needle circular array
with 1 cm diameter, 1 cm needle length. Electroporation pulses were given at
20 V/mm, 50 ms pulse length, 1 pulse/s.

Immunization 3: Protein Immunization

Proteins will be provided at 0.1 mg/ml in citrate buffer. Store at -80°C until use.
Thaw at room temperature; material should be clear with no particulate matter.
Add equal volume of MF59C adjuvant to thawed protein and mix well by
inverting the tube. Immunize each rabbit with 0.5 ml adjuvanted protein per side,
IM/Glut for a total of 1.0 ml per animal. Use material within 1 hour of the
addition of adjuvant.
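
The per-animal doses implied by the preparation instructions in Table E follow from concentration multiplied by injected volume, allowing for the 1:1 dilution of protein with adjuvant. The following Python sketch works through that arithmetic; the function name and the reading of the stock concentrations (1.0 mg/ml and 2.0 mg/ml DNA, 0.1 mg/ml protein) are assumptions made for illustration.

```python
def dose_mg(stock_mg_per_ml, injected_volume_ml, dilution_factor=1.0):
    """Mass delivered per animal.

    dilution_factor is the fold-dilution applied to the stock before
    injection (2.0 when an equal volume of adjuvant is added).
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return stock_mg_per_ml / dilution_factor * injected_volume_ml


# 0.5 ml per side, two sides, gives 1.0 ml per animal in every case.
volume_per_animal_ml = 0.5 * 2

single_subtype_dna_mg = dose_mg(1.0, volume_per_animal_ml)
combined_dna_mg = dose_mg(2.0, volume_per_animal_ml)
protein_mg = dose_mg(0.1, volume_per_animal_ml, dilution_factor=2.0)
```

Under these assumptions each animal would receive 1 mg of single-subtype DNA, 2 mg of combined DNA (1 mg per plasmid), or 0.05 mg (50 µg) of adjuvanted protein.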



The immunization (Table F) and bleeding (Table G) schedules are as follows:

Example 10

Mice Immunization Studies with Gag and Pol Constructs
Cellular and humoral immune responses were evaluated in mice (essentially as
described in Example 4) for the following constructs: Gag, GagProtease(+FS) (GP1,
protease codon optimized and inactivation of INS; GP2, protease only inactivation of
INS), GagPolΔintegrase with frameshift (gagFSpol), and GagPolΔintegrase in-frame
(GagPol) (see Figure 118). Versions of GagPolΔintegrase in-frame were also
designed with attenuated (GagPolAtt) or non-functional Protease (GagPolIna).

In vitro expression data showed comparable expression of p55Gag and p66RT
using Gag alone, GagProtease(+FS), GagFSpol and GagPolIna. Constructs with fully
functional or attenuated protease (GagPol or GagPolAtt) were less efficient in
expression of p55Gag and p66RT, possibly due to cytotoxic effects of protease.

DNA immunization of mice using Gag vs. GP1 and GP2 in pCMV vectors was
performed intramuscularly in the tibialis anterior. Mice were immunized at the start of
the study (0 week) and 4 weeks later. Bleeds were performed at 0, 4, and 6 weeks.
DNA doses used were as follows: 20 µg, 2 µg, 0.2 µg, and 0.02 µg.

DNA immunization of mice using Gag vs. gagFSpol in pCMV vectors was
performed intramuscularly in the tibialis anterior. Mice were immunized at the start of
the study (0 week) and challenged 4 weeks later with recombinant vaccinia virus
encoding Gag (rVVgag). Bleeds were performed at 0 and 4 weeks. DNA doses used
were as follows: 20 µg, 2 µg, 0.2 µg, and 0.02 µg.

DNA immunization of mice using Gag vs. gagFSpol and gagpol in pCMV
vectors was performed intramuscularly in the tibialis anterior. Mice were immunized
at the start of the study (0 week) and challenged 4 weeks later with recombinant
vaccinia virus encoding Gag (rVVgag). Bleeds were performed at 0 and 4 weeks.
DNA doses used were as follows: 2 µg, 0.2 µg, 0.02 µg, and 0.002 µg.

Cellular immune responses against Gag were comparable for all tested variants;
for example, Gag, GagProtease, gagFSpol and GagPolIna all had comparable
potencies.

Humoral immune responses to Gag were also comparable with the exception of
GP2 and especially GP1. Humoral immune responses were weaker in constructs
comprising functional or attenuated proteases, which may be due to less efficient
secretion of p55Gag caused by overactive protease.

In vitro and in vivo experiments, performed in support of the present invention,
suggest that the expression and immunogenicity of Gag was comparable with all
constructs. Exceptions were GagPol in-frame with fully functional or attenuated
protease. This may be the result of cytotoxic effects of protease. The immune
response in mice correlated with relative levels of expression in vitro.



Example 11

Protein Expression, Immunogenicity, and Generation of Neutralizing Antibodies Using
Type C Derived Envelope Polypeptides
Envelope (Env) vaccines derived from the subtype C primary isolate, TV1,
recovered from a South African individual, were tested in rabbits as follows. Gene
cassettes were designed to express the gp120 (surface antigen), gp140 (surface antigen
plus ectodomain of transmembrane protein, gp41), and full-length (gp120 plus gp41)
gp160 forms of the HIV-1 envelope polyprotein with and without deletions of the
variable loop regions, V2 and V1V2. All of the genes were sequence-modified to
enhance expression of the encoded Env glycoproteins in a Rev-independent fashion
and they were subsequently cloned into pCMV-based plasmid vectors for DNA
vaccine and protein production applications as described above. The sequences were
codon optimized as described herein. Briefly, all the modified envelope genes were
cloned into the Chiron pCMVlink plasmid vector, preferably into EcoRI/XhoI sites.



A. Protein Expression

Full-length (gp160), truncated gp140 (Env ectodomain only) and gp120 native
versions of the TV1 Env antigen were produced from the expression cassettes
described herein. The gp140 encoding sequences were transiently transfected into
293T cells. The expression levels of the gene products were evaluated by an in-house
antigen capture ELISA. Envelope genes constructed from the native sequences of
TV001c8.2, TV001c8.5 and TV002c12.1 expressed the correct proteins in vitro, with
gp140TV001c8.2 exhibiting the highest level of expression. In addition, the Env
protein expressed from the TV1-derived clone 8.2 was found to bind the CD4 receptor
protein, indicating that this feature of the expressed protein is maintained in a functional
conformation. The receptor binding properties/functionality of the expressed TV1
gp160 protein were also confirmed by a cell-fusion assay.

Total expression increased approximately 10-fold for synthetic gp140
constructs compared with the native gp140 gene cassettes. Both the modified gp120
and gp140 variants secreted high amounts of protein in the supernatant. In addition,
the V2 and V1V2 deleted forms of gp140 expressed approximately 2-fold more
protein than the intact gp140. Overall, the expression levels of synthetic gp140 gene
variants increased 10 to 26-fold compared with the gp140 gene with native sequences.

In sum, each synthetic construct tested showed more than 10-fold increased
levels of expression relative to those using the native coding sequences. Moreover, all
expressed proteins were of the expected molecular weights and were shown to bind
CD4. Stable CHO cell lines were derived and small-scale protein purification methods
were used to produce small quantities of each of the undeleted and V-deleted
oligomeric forms (o-gp140) of these proteins for vaccine studies.

B. Neutralization properties of TV001 and TV002 viral isolates

The transient expression experiment showed that the envelope genes derived
from the TV001 and TV002 virus isolates expressed the desired protein products.
Relative neutralization sensitivities of these two viral strains using sera from 18
infected South African individuals (subtypes B and C) were as follows. At a 1:10
serum dilution, the TV2 strain was neutralized by 18 of 18 sera; at 1:50, 16 of 18; at
1:250, 15 of 18. In comparison, the TV1 isolate was neutralized by 15 of 18 at 1:10;
only 6 of 18 at 1:50; and none of the specimens at 1:250. In addition, the TV001
patient serum showed neutralization activity against the TV002 isolate at all dilutions
tested. In contrast, the TV002 patient serum showed neutralization of TV001 only at
the 1:10 serum dilution. These results suggest that the TV001 isolate is capable of
inducing a broader and more potent neutralizing antibody response in its infected host
than TV002.

C. Immunogenicity of the modified TV1 Env DNA and protein antigens in
rabbit studies

TV1 Env DNA (comprising the synthetic expression cassettes) and protein
vaccines were administered as shown in the following Table H.



Table H

Group   Plasmid DNA (0, 4, and 20 wks)   Protein boost (20 wks)
-----   ------------------------------   ----------------------
1       pCMVgp160TV1                     o-gp140.TV1
2       pCMVgp160dV2.TV1                 o-gp140dV2.TV1
3       pCMVgp160dV1V2.TV1               o-gp140dV1V2.TV1
4       pCMVgp140.TV1                    o-gp140.TV1
5       pCMVgp140dV2.TV1                 o-gp140dV2.TV1
6       pCMVgp140dV1V2.TV1               o-gp140dV1V2.TV1
7       pCMVgp140dV2.SF162               o-gp140dV2.SF162


Seven groups of 4 rabbits per group were immunized with the designated
plasmid DNA and oligomeric Env protein antigens. Three doses of DNA, 1 mg of
DNA per animal per immunization, were administered intramuscularly by needle
injection followed by electroporation at weeks 0, 4, and 20. A single dose of
100 µg of Env protein in MF59 adjuvant also was given intramuscularly at a separate
site at 20 weeks.

The DNA immunization used subtype C sequence-modified genes (TV1) -
gp160, gp160dV2, gp160dV1V2, gp140, gp140dV2 and gp140dV1V2 - as well as a
subtype B SF162 sequence-modified gp140dV2. DNA immunizations were
performed at 0, 4, and 20 weeks by needle injection by the intramuscular route using
electroporation to facilitate transfection of the muscle cells and of resident antigen
presenting cells.

A single Env protein booster (in MF59 adjuvant) was given at 20 weeks by
intramuscular injection at a separate site. Antibody titers were evaluated by ELISA
following each successive immunization. Serum specimens were collected at 0, 4, 6, 8,
12, 22, and 24 weeks. Serum antibody titers were measured by ELISA. 96-well plates
were coated with protein at a concentration of 1 µg/ml. Serum samples were diluted
serially 3-fold. Goat anti-rabbit peroxidase conjugate (1:20,000) was used for
detection. TMB was used as the substrate, and the antibody titers were read at 0.6 OD
at 450 nm.

Neutralizing antibody responses against PBMC-grown R5 HIV-1 strains were
monitored in the sera collected from the immunized rabbits using two different assays
in two different laboratories: the 5.25 reporter cell-line based assay at Chiron and the
PBMC-based assay of David Montefiori at Duke University. Results are shown in
Figures 121, 122, and 123. The Chiron assay was conducted essentially as follows.
Neutralizing antibody responses against the PBMC-grown subtype C TV001 and
TV002 strains were measured using an in-house reporter cell line assay that uses the
5.25 cell line. This cell line has CD4, CCR5, CXCR4 and BONZO receptor/co-receptors
on its cell membrane. The parental CEM cell line was derived from a 4-year-old
Caucasian female with acute lymphoblastic leukemia, which was fused with the human
B cell line 721.174, creating CEMx174. LTR-GFP was transfected into the cells after
the CCR5 gene (about 1.1 kb) was cloned into the BamH-I (5') and Sal-I (3') sites of the
pBABE puro retroviral vector, and subsequently introduced into the CEMx174. The
green fluorescence protein (GFP) of the cells was detected by flow cytometer
(FACScan). For the virus neutralization assay, 50 µl of titrated virus and 50 µl of
diluted immune or pre-immune serum were incubated at room temperature for one
hour. This mixture was added into wells with 10^[illegible]/ml cells plated in a 24 well
plate, and incubated at 37°C for 5 to 7 days. The cells were then fixed with 2%
formaldehyde after washing with PBS. Fifteen thousand events (cells) were collected for
each sample on a Becton Dickinson FACScan using Cellquest software. The data
presented were the mean of the triplicate wells. The percent neutralization was
calculated compared to the virus control using the following equation:
% virus inhibition = (virus control - experimental)/(virus control - cell control) x 100.
Any virus inhibition observed in the pre-bleed has been subtracted for each individual
animal. Values >50% are considered positive and are highlighted in gray.
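
The percent virus inhibition equation above, together with the pre-bleed subtraction and the >50% positivity cutoff, can be sketched in Python as follows. This is a minimal illustration, not the analysis code used for Figures 121-123: the function names are hypothetical, the readout values are invented for the example, and subtracting only a positive pre-bleed value is an assumption.

```python
def percent_virus_inhibition(virus_control, experimental, cell_control):
    """% virus inhibition =
    (virus control - experimental) / (virus control - cell control) x 100."""
    denominator = virus_control - cell_control
    if denominator == 0:
        raise ValueError("virus control and cell control must differ")
    return (virus_control - experimental) / denominator * 100.0


def net_inhibition(immune_pct, prebleed_pct):
    """Subtract the inhibition seen with the same animal's pre-bleed serum.

    Assumption: a negative pre-bleed value is treated as zero so that it
    does not inflate the post-immunization result.
    """
    return immune_pct - max(prebleed_pct, 0.0)


def is_positive(net_pct, cutoff=50.0):
    """Values greater than the cutoff (50% in the text) are scored positive."""
    return net_pct > cutoff


# Hypothetical readouts (illustrative values only).
immune = percent_virus_inhibition(virus_control=1000.0, experimental=300.0,
                                  cell_control=100.0)
prebleed = percent_virus_inhibition(virus_control=1000.0, experimental=910.0,
                                    cell_control=100.0)
net = net_inhibition(immune, prebleed)
```

Normalizing to the span between the virus control and the cell control puts every serum on the same 0 to 100 scale regardless of the absolute signal in a given run.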

In Figure 122, marked entries indicate that animals had high levels of virus inhibition
in pre-bleed serum (>20% virus inhibition) that impacted the magnitude of the
observed inhibition and, in some cases, our ability to score the serum as positive or
negative for the presence of significant neutralizing antibody activity (< 50%
inhibition).

For the data presented in Figure 123, serum samples collected after a
single protein boost (post-third) were screened in triplicate at a 1:8 dilution with virus
(1:24 after addition of cells). Values shown are the % reduction in p24 synthesis
relative to that in the corresponding pre-bleed control samples. Zero values indicate
no or negative values were measured. NV, not valid due to virus inhibition in pre-immune
serum. Neutralization was considered positive when p24 was reduced by at
least 80%; these samples are highlighted in dark gray. Samples with lighter gray
shading showed at least a 50% reduction in p24 synthesis.

Figure 119 shows the ELISA data when plates were coated with the
monomeric gp120.TV1 protein. This protein is homologous to the subtype C genes
used for the immunization. All immunization groups produced high antibody titers
after the second DNA immunization. The groups immunized with gp140 forms of
DNA have relatively higher geometric mean antibody titers as compared to the groups
using gp160 forms after both first and second DNA immunizations. Both the
gp140.TV1 and gp140dV1V2.TV1 genes produced high antibody titers of about
10^[illegible] at two weeks post second DNA; the gp140dV2.TV1 plasmid yielded the
highest titers of antibodies (>10^[illegible]) at this time point and all others. The binding
antibody titers to the gp120.TV1 protein were higher for the group immunized with the
homologous gp140dV2.TV1 genes than for that immunized with the heterologous
gp140dV2.SF162 gene, which showed titers of about 10^[illegible]. All the groups
showed some decline in antibody titers by 8 weeks post the second DNA immunization.
Following the DNA plus protein booster at 20 weeks, all groups reached titers above
that previously observed after the second DNA immunization (0.5-1.0 log increases
were observed). After the protein boost, all animals receiving the o-gp140dV2.TV1
protein, whether primed by the gp140dV2.TV1 or gp160dV2.TV1 DNA, showed the
highest Ab titers.

Binding antibody titers were also measured using ELISA plates coated with
either oligomeric subtype C o-gp140dV2.TV1 or subtype B o-gp140dV2.SF162
proteins (Figure 120). For all the TV1 Env immunized groups, the antibody titers
measured using the oligomeric protein, o-gp140dV2.TV1, were higher than those
measured using the monomeric (non-V2-deleted) protein, gp120.TV1. In fact, for
these groups, the titers observed with the heterologous subtype B o-gp140dV2.SF162
protein were comparable to or greater than those measured with the subtype C TV1
gp120. Nevertheless, all groups immunized with subtype C immunogens showed
higher titers binding to the subtype C o-gp140dV2.TV1 protein than to the subtype B
protein gp140dV2.SF162. Conversely, the group immunized with the
gp140dV2.SF162 immunogen showed higher antibody titers with the oligomeric
subtype B protein relative to its subtype C counterpart. Overall, all three assays
demonstrated that high titers of cross-reactive antibodies were generated by the
subtype C TV1-based DNA and protein immunogens.

The results indicate that the subtype C TV1-derived Env DNA and protein
antigens are immunogenic, inducing high titers of antibodies in immunized rabbits and
substantial evidence of neutralizing antibodies against both subtype B and subtype C
R5 virus strains. In particular, the gp140dV2.TV1 antigens have induced consistent
neutralizing responses against the subtype B SF162EnvDV2 and subtype C TV2
strains. Thus, TV1-based Env DNA and protein-based antigens are immunogenic and
induce high titer antibody responses reactive with both subtype C and subtype B HIV-1
Env antigens. Neutralizing antibody responses against the neutralization-sensitive
subtype B R5 HIV-1 SF162dV2 strain were observed in some groups after only two DNA
immunizations. Following a single booster immunization with Env protein, the
majority of rabbits in groups that received V2-deleted forms of the TV1 Env showed
neutralization activity against the closely related subtype C TV2 primary strain.

Example 12 

Immunological Responses in Rhesus Macaques
Cellular and humoral immune responses were evaluated in three groups of
rhesus macaques (each group was made up of four animals) in an immunization study
structured as shown in Table I. The route of administration for the immunizing
composition was electroporation in each case. Antibody titers are shown in Table I for
two weeks post-second immunization.

Table I

Group   Formulation of Immunizing Composition *   Animal #   Titer
-----   ---------------------------------------   --------   -----
1       pCMVgag (3.5 mg) + pCMVenv (2.0 mg)       A          3,325
                                                  B          4,000
                                                  C (a)      1,838
                                                  D (a)      1,850
2       pCMVgag (3.5 mg) + pCMVpol (4.2 mg)       A (b)        525
                                                  B          5,313
                                                  C          6,450
                                                  D          5,713
3       pCMVgag-pol (5.0 mg)                      A (c)          0
                                                  B (d)      1,063
                                                  C            513
                                                  D (e)        713

(a) previously immunized with HCV core ISCOMS, rVVC core E1
(b) previously immunized with HCV core ISCOMS, rVVC core E1, p55gagLAI(VLP)
(c) previously immunized with HCV core ISCOMS, rVVC core E1, [illegible]
(d) previously immunized with rVVC/E1, pCMV Epo-Epi, HIV/HCV-VLP,
    pCMVgagSF2, pUCgp120 SF2
(e) previously immunized with rVVC/E1, HIV/HCV-VLP

* pCMVgag = pCMVKm2.GagMod Type C Botswana
pCMVenv = pCMVLink.gp140env.dV2.TV1 (Type C)
pCMVpol = pCMVKm2.p2Pol.mut.Ina Type C Botswana
pCMVgag-pol = pCMVKm2.gagCpol.mut.Ina Type C Botswana



Pre-immune sera were obtained at week 0 before the first immunization. The
first immunization was given at week 0. The second immunization was given at week
4. The first bleed was performed at 2 weeks post-second immunization (i.e., at week
6). A third immunization will be given at week 8 and a fourth at week 16. Animals
2A, 3A, 3B and 3D had been vaccinated previously (approximately 4 years or more
earlier) with gag plasmid DNA or gag VLP (subtype B).

Bulk CTL, 51Cr-release assays, and flow cytometry methods were used to
obtain the data in Tables J and K. Reagents used for detecting gag- and pol-specific
T-cells were (i) synthetic, overlapping peptides spanning the "gagCpol" antigen (n=377);
typically the peptides were pools of 15-mers overlapping by 11, and the pools were as
follows: pool 1, n=1-82; pool 2, n=83-164; pool 3, n=165-271; pool 4, n=272-377;
accordingly pools 1 and 2 are "gag"-specific, and pools 3 and 4 are "pol"-specific; and
(ii) recombinant vaccinia virus (rVV), for example, rVVgag965, rVVp2Pol975
(contains p2p7gag975), and VVparent.
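
The peptide reagent layout described above (15-mers overlapping by 11, divided into four pools by peptide number) can be sketched in Python as follows. The function names are hypothetical and the input sequence is a placeholder string, not the gagCpol sequence; only the peptide length, the overlap, and the pool boundaries come from the text.

```python
def overlapping_peptides(sequence, length=15, overlap=11):
    """Split a protein sequence into fixed-length peptides in which each
    peptide shares `overlap` residues with the next one.

    Residues left over at the C-terminus that do not fill a whole peptide
    are not covered by this simple scheme.
    """
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


def split_into_pools(peptides, boundaries):
    """Group peptides by 1-based, inclusive (first, last) peptide numbers."""
    return [peptides[first - 1:last] for first, last in boundaries]


# Pool boundaries as stated in the text (peptide numbers, n = 377 in total).
POOL_BOUNDARIES = [(1, 82), (83, 164), (165, 271), (272, 377)]

# Placeholder sequence of 1519 residues: 15 + 4 * 376 residues yield exactly
# 377 peptides. This is NOT the real antigen sequence.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 76)[:1519]
peptides = overlapping_peptides(placeholder)
pools = split_into_pools(peptides, POOL_BOUNDARIES)
```

With an offset of 4 residues between consecutive 15-mers, every stretch of up to 12 contiguous residues is contained whole in at least one peptide, which is why overlaps of this size are commonly chosen for T-cell epitope screening.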

Gag-specific IFNγ+ CD8+ T-cells, Gag-specific IFNγ+ CD4+ T-cells, Pol-specific
IFNγ+ CD8+ T-cells, and Pol-specific IFNγ+ CD4+ T-cells in blood were
determined for each animal described in Table I above, post second immunization.
The results are presented in Tables J and K. It is possible that some of the pol-specific
activity shown in Table K was directed against p2p7gag.

Table J
Gag Assay Results

                               Gag-Specific CD4+ Responses             Gag-Specific CD8+ Responses
Group/   Immunizing            LPA (SI)                   Flow         CTL                Flow
Animal   Composition           p55     Pool 1   Pool 2    IFNg+        Pool 1   Pool 2    IFNg+
1A       pCMVgag, pCMVenv      3.3     5.9      3.8       496          minus    minus     225
1B       pCMVgag, pCMVenv      11.8    4.4      1.5       786          minus    minus     160
1C       pCMVgag, pCMVenv      5.7     1.1      2.4       361          plus     plus      715
1D       pCMVgag, pCMVenv      6.5     3.1      1.6       500          plus     ?         506
2A       pCMVgag, pCMVpol      4.8     4.8      1.6       405          plus     minus     [illegible]
2B       pCMVgag, pCMVpol      12.5    6.8      3.3       [missing]    plus     minus     [missing]
2C       pCMVgag, pCMVpol      6.0     3.8      2.1       776          minus    minus     0
2D       pCMVgag, pCMVpol      18.9    13.5     5.4       1351         minus    minus     [illegible]
3A       pCMVgagpol            12.2    7.0      1.5       560          plus     plus      3595
3B       pCMVgagpol            2.7     5.6      1.3       508          plus     ?         3256
3C       pCMVgagpol            11.6    5.0      1.2       289          minus    ?         617
3D       pCMVgagpol            1.5     1.2      1.4       120          minus    minus     277

? = might be positive on rVVp2Pol.
Table K
Pol Assay Results

                               Pol-Specific CD4+ Responses     Pol-Specific CD8+ Responses
Group/   Immunizing            LPA (SI)           Flow         CTL                Flow
Animal   Composition           Pool 3   Pool 4    IFNg+        Pool 3   Pool 4    IFNg+
1A       pCMVgag, pCMVenv      1        1.2       0            minus    minus     0
1B       pCMVgag, pCMVenv      1        1         0            minus    minus     0
1C       pCMVgag, pCMVenv      1        1.1       0            minus    minus     0
1D       pCMVgag, pCMVenv      1.2      1.3       0            minus    minus     262
2A       pCMVgag, pCMVpol      1.1      0.9       92           minus    minus     459
2B       pCMVgag, pCMVpol      2.5      1.8       107          minus    minus     838
2C       pCMVgag, pCMVpol      1.2      1.1       52           plus     minus     580
2D       pCMVgag, pCMVpol      2.5      2.7       113          plus     plus      5084
3A       pCMVgagpol            2.7      2.4       498          minus    minus     3631
3B       pCMVgagpol            1.1      1         299          minus    minus     1346
3C       pCMVgagpol            2.1      1.4       369          minus    minus     399
3D       pCMVgagpol            1.3      1.8       75           minus    minus     510




These results support the conclusion that the constructs of the present invention
are capable of generating specific cellular and humoral responses against the
selected HIV polypeptide antigens.
Although preferred embodiments of the subject invention have been described
in some detail, it is understood that obvious variations can be made without departing
from the spirit and the scope of the invention as defined by the appended claims.
What is claimed is:

1. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Gag polypeptide, wherein the polynucleotide sequence
encoding said Gag polypeptide comprises a sequence having at least 90% sequence
identity to a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID
NO:10, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID
NO:16, SEQ ID NO:17, SEQ ID NO:18 and SEQ ID NO:19.

2. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Gag polypeptide, wherein the polynucleotide sequence
encoding said Gag polypeptide comprises a sequence having at least 90% sequence
identity to at least 500 contiguous nucleotides of SEQ ID NO:12 or SEQ ID NO:20.

3. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Env polypeptide, wherein the polynucleotide sequence
encoding said Env polypeptide comprises a sequence having at least 90% sequence
identity to SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID
NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29 and SEQ
ID NO:30.

4. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Env polypeptide, wherein the polynucleotide sequence
encoding said Env polypeptide comprises a sequence having at least 90% sequence
identity to SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID
NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37, and SEQ ID NO:38.

5. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Int polypeptide, wherein the polynucleotide sequence
encoding said Int polypeptide comprises a sequence having at least 95% sequence
identity to SEQ ID NO:39.



WO 03/004620 PCT/US02/21420



6. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Int polypeptide, wherein the polynucleotide sequence
encoding said Int polypeptide comprises a sequence having at least 98% sequence
identity to SEQ ID NO:40.

7. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Nef polypeptide, wherein the polynucleotide sequence
encoding said Nef polypeptide comprises a sequence having at least 90% sequence
identity to SEQ ID NO:41 or SEQ ID NO:203.

8. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV p15RNaseH polypeptide, wherein the polynucleotide
sequence encoding said p15RNaseH polypeptide comprises a sequence having at least
90% sequence identity to SEQ ID NO:42.

9. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Pol polypeptide, wherein the polynucleotide sequence
encoding said Pol polypeptide comprises a sequence having at least 95% sequence
identity to a sequence selected from the group consisting of SEQ ID NO:43, SEQ ID
NO:44 and SEQ ID NO:45.

10. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Tat polypeptide, wherein the polynucleotide sequence
encoding said Tat polypeptide comprises a sequence having at least 90% sequence
identity to a sequence selected from the group consisting of SEQ ID NO:46, SEQ ID
NO:47 and SEQ ID NO:48.

11. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Prot polypeptide, wherein the polynucleotide sequence
encoding said Prot polypeptide comprises a sequence having at least 95% sequence
identity to SEQ ID NO:49 or SEQ ID NO:50.
12. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Prot polypeptide, wherein the polynucleotide sequence
encoding said Prot polypeptide comprises a sequence having at least 90% sequence
identity to SEQ ID NO:51.

13. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Rev polypeptide, wherein the polynucleotide sequence
encoding said Rev polypeptide comprises a sequence having at least 90% sequence
identity to SEQ ID NO:52.

14. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Tat polypeptide, wherein the polynucleotide sequence
encoding said Tat polypeptide comprises a sequence having at least 90% sequence
identity to a sequence selected from the group consisting of SEQ ID NO:53, SEQ ID
NO:54, SEQ ID NO:55, SEQ ID NO:56, SEQ ID NO:57, SEQ ID NO:58, SEQ ID
NO:59, and SEQ ID NO:60.

15. An expression cassette, comprising a polynucleotide sequence encoding a
polypeptide including an HIV Env polypeptide, wherein the polynucleotide sequence
encoding said Env polypeptide comprises a sequence having at least 90% sequence
identity to a sequence selected from the group consisting of SEQ ID NO:183, SEQ ID
NO:184, SEQ ID NO:185, SEQ ID NO:186, SEQ ID NO:187, SEQ ID NO:188, SEQ
ID NO:189, SEQ ID NO:190 and SEQ ID NO:191.

16. A recombinant expression system for use in a selected host cell, comprising, an
expression cassette of any of claims 1 to 15, and wherein said polynucleotide sequence
is operably linked to control elements compatible with expression in the selected host
cell.

17. The recombinant expression system of claim 16, wherein said control elements are
selected from the group consisting of a transcription promoter, a transcription
enhancer element, a transcription termination signal, polyadenylation sequences,
sequences for optimization of initiation of translation, and translation termination
sequences.

18. The recombinant expression system of claim 16, wherein said transcription
promoter is selected from the group consisting of CMV, CMV+intron A, SV40, RSV,
HIV-Ltr, MMLV-ltr, and metallothionein.

19. A cell comprising an expression cassette of any of claims 1 to 15, and wherein said
polynucleotide sequence is operably linked to control elements compatible with
expression in the selected cell.

20. The cell of claim 19, wherein the cell is a mammalian cell. 

21. The cell of claim 20, wherein the cell is selected from the group consisting of
BHK, VERO, HT1080, 293, RD, COS-7, and CHO cells.

22. The cell of claim 21, wherein said cell is a CHO cell.

23. The cell of claim 19, wherein the cell is an insect cell.

24. The cell of claim 23, wherein the cell is either Trichoplusia ni (Tn5) or Sf9 insect
cells.

25. The cell of claim 19, wherein the cell is a bacterial cell.

26. The cell of claim 19, wherein the cell is a yeast cell.

27. The cell of claim 19, wherein the cell is a plant cell.

28. The cell of claim 19, wherein the cell is an antigen presenting cell.

29. The cell of claim 28, wherein the antigen presenting cell is a lymphoid cell selected
from the group consisting of macrophages, monocytes, dendritic cells, B-cells, T-cells,
stem cells, and progenitor cells thereof.

30. The cell of claim 19, wherein the cell is a primary cell.

31. The cell of claim 19, wherein the cell is an immortalized cell.

32. The cell of claim 19, wherein the cell is a tumor-derived cell.

33. A method for producing a polypeptide including HIV Gag polypeptide
sequences, said method comprising,

incubating the cells of claim 19, under conditions for producing said
polypeptide.

34. A gene delivery vector for use in a mammalian subject, comprising

a suitable gene delivery vector for use in said subject, wherein the vector
comprises an expression cassette of any of claims 1 to 15, and wherein said
polynucleotide sequence is operably linked to control elements compatible with
expression in the subject.

35. A method of DNA immunization of a subject, comprising,

introducing a gene delivery vector of claim 34 into said subject under
conditions that are compatible with expression of said expression cassette in said
subject.

36. The method of claim 35, wherein said gene delivery vector is a nonviral vector.

37. The method of claim 35, wherein said vector is delivered using a particulate
carrier.




38. The method of claim 37, wherein said vector is coated on a gold or tungsten 
particle and said coated particle is delivered to said subject using a gene gun. 

39. The method of claim 35, wherein said vector is encapsulated in a liposome
preparation.

40. The method of claim 35, wherein said vector is a viral vector.

41. The method of claim 40, wherein said viral vector is a retroviral vector.

42. The method of claim 40, wherein said viral vector is an alphaviral vector.

43. The method of claim 40, wherein said viral vector is a lentiviral vector.

44. The method of claim 35, wherein said subject is a mammal.

45. The method of claim 44, wherein said mammal is a human. 

46. A method of generating an immune response in a subject, comprising

transfecting cells of said subject with a gene delivery vector of claim 34, under
conditions that permit the expression of said polynucleotide and production of said
polypeptide, thereby eliciting an immunological response to said polypeptide.

47. The method of claim 46, wherein said vector is a nonviral vector.

48. The method of claim 46, wherein said vector is delivered using a particulate
carrier.

49. The method of claim 46, wherein said vector is coated on a gold or tungsten
particle and said coated particle is delivered to said vertebrate cell using a gene gun.




50. The method of claim 46, wherein said vector is encapsulated in a liposome 
preparation. 

51. The method of claim 46, wherein said vector is a viral vector.

52. The method of claim 51, wherein said viral vector is a retroviral vector.

53. The method of claim 51, wherein said viral vector is an alphaviral vector.

54. The method of claim 51, wherein said viral vector is a lentiviral vector.

55. The method of claim 46, wherein said subject is a mammal.

56. The method of claim 55, wherein said mammal is a human. 






57. The method of claim 46, wherein said transfecting is done ex vivo and said 
transfected cells are reintroduced into said subject. 

58. The method of claim 46, wherein said transfecting is done in vivo in said subject. 

59. The method of claim 46, where said immune response is a humoral immune 
response. 



60. The method of claim 46, where said immune response is a cellular immune
response.

61. The method of claim 46, wherein the gene delivery vector is administered
intramuscularly, intramucosally, intranasally, subcutaneously, intradermally,
transdermally, intravaginally, intrarectally, orally or intravenously.
1 TGGAAGGGTT AATTTACTCC AAGAAAAGGC AAGAAATCCT TGATTTGTOG GTCTATCACA
61 CACAAGGCTT CTTCCCTGAT TGGCAAAACT ACACACCX3GG GCCAGGGGTC AGATATCCAC 
121 TGACCTTTGG ATGGTGCTAC AAGCTAGTGC CAGTTGACCC AGGGGAGGTG GAAGAGGCCA 
181 AOGGAGGAGA AGACAACTGT TTGCTACACC CTATGAGCCA ACATQGAGCA GAGQATGAAG 
241 ATAGAQAAGT ATTAAAGTGG AAGTTTGACA GCCTCCTAGC AGGCAQACAC ATGGCCCGCG 

3 01 AGCTACATCC GGAGTATTAC AAAGACTGCT GACAGAQAAO OGACTTTOOG CCTGGOACTT 
t 361 TCCACTGGGG CX3TTCCGGGA GGTGTOGTCT GGGCGGGAGfT TGGGAQT6GT CAACXXTTCAG 

421 ATGCTGCATA TAAGCAGCTG CTTTTCGCCT GTACTGGGTC TCTCTCGGTA GACCAGATCT 

4 81 GAGCCTGGGA GCCCTCTGGC TATCTAGGGA ACCCACTGCT TAAGCCTCAA TAAAGCrtcc 
541 CTTGAGTGCT TTAAGTAGTG TGTGCCCATC TGTTGTGTGA CTCTGGTAAC TAGAGATCCC 
601 TCAGACCCTT TGTGGTAGTG TGGAAAATCT CTAGCAGTGG CGCCCGAACA GGGACCAGAA 
661 AGTGAAAGTG AGACCAGAGG AGATCTCTCG ACX3CAGGACT CX5GCTTGCTG AAGTGCACAC 
721 GGCAAGAGGC GAGAGGGGCG GCTGGTGAGT ACGCCAATTT TACTTGACTA GCGGAGGCTA 
7 81 GAAGGAGAGA GATGGGTGCG AGAGCGTCAA TATTAAGCGG CX3GAAAATTA GATAAATGGG 
841 AAAG/VATTAG GTTAAGGCCA GGGGGAAAGA AACATTATAT GTTAAAACAT CTAGTATCGG 
901 CAAGCAGGGA GCTGGAAAGA TTTGCACTTA ACCCTGGCCT GTTAGAAACA TCAGAAGGCT 
961 GTAAACAAAT AATAAAACAG CTACAACCAG CTCTTCAGAC AGGAACAGAG GAACTTAGAT 
1021 CATTATTCAA CACAGTAGCA ACTCTCTATT GTGTACATAA AGGGATAQAG GTACQAGACA 
1081 CCAAGGAAGC CTTAGACAAG ATAGAGGAAG AACAAAACAA ATGTCAGCAA AAAGCACAAC 
1141 AGGCAAAAGC AGCTGACGAA AAGGTCAGTC AAAATTATCC TATAQTACAG AATGCCCAAG 
1201 GGCAAATGGT ACACCAAGCT ATATCACCTA GAACATTGAA TGCATGGATA AAAGTAATAG 
1261 AGOAAAAGGC TTTCAATCCA GAGGAAATAC CCATGTTTAC AGCATTATGA GAAGGAGCCA 
1321 CXXXACAAGA TTTAAACACA ATGTTAAATA CAGTGGGGOG ACA*tGAAGGA GCCATOCAAA 
1381 TGTTAAAAGA TACCSeTCAAT GAGGAGGCTG CAGAATOGGA TAGGACACAT CCAGTACATG 
1441 CAGGGCCTGT TGC3VCC3VGGC CAQATOAQAG AACCAAQGGG AAGTGACATA GCAGGAACTA 
ISOl CTAGTACCCT TCAGGAACAA ATAGCATGGA TGACAAGTAA TCCACCTATT CXAGTAGAAG 
IS 61 ACATC TATAA AAGATGGATA ATTCTGGGGT TAAATAAAAT AGTAAGAATG TATAGCCCTG 
1^21 TTAGCArrTT GGACATAAAA CAAGGOCXAA AAGAACCXTr TAGA6ACTAT CTCAGACCGGT 
1681 TCTTTAAAAC CITAAGAGCT GAACAAOCTA CACAAGAT6T AAAGAATTGG ATQACAGACA 
1741 CCTTQTXGGT CCAAAATGCiO AACCX3V0ATT GTAAGAGCAT TTIAAQAGCA TXAGGACCAG 
1801 GGGCCTCATT AGAAGAAATG A3X3ACAGCAT GTCAGGGAOT OGGAQQACCT AGCCAJCAAAG 
1861 CMOm rCSTT GGCIOAGGCA ATdAGCCAAG GAAACA0TAA GAXACZAGTG CAGAQAAGCA 
X921 ATTTTAAAOS CTCXAACAGA ATXKTXAAAT GTrrCAACIG TGOaVAAOTA GOGCACIVaaO 
1961 OCAOAAAZTG CPlOOQCOCCT AfiOAAAAAGG GCXGRnXSGAA AXOXGGACAO GAAGGACACC 
2041 AAATOAAAGA CXCTACTGAG AGGGAGGCXA ATXTXTTA08 OAAAAXTIGa OCTTCXXavCA 
2101 AGGGGAGGCC AGGGAATTTC CXCCAGAACA GAOCAGAGCX: AACAGCCC!CA CCAGCAGAAC 
2161 CAACAGCXXX: ACCAGCAGAG AGCrTCAGGT TXXSAGOHGAC MOXXXXSTO CXX5AGGAAGG 
2221 AGAAAGAGAG GOAACCTTTA ACTTCCCXCA AATCRCTCTT TOQCAGCSQAC COCTOOTCXC 
2281 AATAAAA<3TA GAGGGCCAGA TAAAGGAGGC TCTCXTAOAC AGAGQAGGAO ATGAXACAGT 
2341 ATTAG AAGAA ATAOATTXGC GAGGGAAATC GAAACCAAAA AtGATAGQGO GAATXGGAGG 
2401 TTTCATCAAA QIAAGACAGT ATGAI^AAAT ACTXATAGAA AXTIGTGGAA AAAAGGCTAT 
2461 AGOTACAGTA TCAGTAGGGC CTAC3WX3VGT GAACATAATT GGAAGAAATC TOTTAACTCA 
2521 GCTTGGATGC ACACXAAAIT TXOCAATTAG TCCIATZGAA ACXGTACXIAG TAAAATXAAA 
2581 ACCAGGAATG GATGGCCCAA AGGTCAAACA ATGGOCATTG AGAGAAGAAA AAATAAAAGC 
2641 ATTAACAGCA ATTIGIGA0G AAATOGAG^kA GGAAGGAAAA ATIAGAAAAA TXOGGCCTOA 
2701 TAATOCAXAT AAGACICCAO TATITGOCAT AAAAAAGAAG CACAOTACTA AGTXSGAGAAA 
27€1 ATTASZAGAT TTCAOOaAAC TCAATAAAAO AACTGAAGAC TTTTGGQAAG TTCAAJraAGG 
2821 AA7A0CACAC CCAGCAGGAT TAAAAAAOAA AAAATGA6T6 ACAGXGCZAG AXGXOGGGGA 
2881 TGG3li:ATTTr TCAOTTOCXT TAOATGAAAG CTTCAGOAAA TAmCTGCSlT TGAOCAIAOC 
2941 TAGTATAA^C AAT^AAACAC CAOGGATTAG JTATC^ S^^S
3001 GAAAGGATCA CCAGCAATAT TCX»flROTAfl CATOACAAAA ATCTTA^hGC CCTTCAO^C 
^JamMCCR GACATAGTTA TCTATCRATA TATGGATOAC TXOTATOTAO GATCW3ACTT 
>?2i ^JJ^^SS CAAAAATAGA AGAGXTAAGG GAACATTTAT toaaatgqgg 

3«i ^SSaga aacatcaaaa agaaccccca tttctttoga tgggot^a 

llVi t^rSSS ^AAATOGA CAOTACAACC TATACTGCTG CCAGAAAAGG ATAGTTGGAC 

WVi =^ ™S JSSS 

SSS? t^TTT T^JOJ»^ 

oc»™ac« t««*t»c «;ttt^ J^SSS iSl^ 

r." =^ J=s 

nil =s s= ^ ™^ 
iroi s= ?=ss is= ^ ™ ~ 
is= =ss ^ ^ ~ 

— = ^ S S 
^ ^ ^ 

4321 AAAAGAAATA GTAGCTAGCT GTGATAAATG ^CAGCTmRAA ^^^^^ OAAAAATCAT 
4361 AGTCGACTGT AGTCCAGGGA TATGGCAATT AGftTXQTOOC ^JJ^ ^SSaSc 

ir-i s= =J s= 
s= 

lioi otSrOXGCA GGGGAAAGAA TAAXftGACKT AAXAGCAUCA ^^JS^ 

^RAAACAA ^TlftXAAflAIV TICm^TTT TOGGO^ 
4921 XATTTGOAAA GGACCftflOOG AAClftCTCia GARAfiGIGRA ^p!^™.* 
Vsb\ ATAAAa<n«fl TACCAAGGRfl 

ss= =n ™ ™ 

Si SISTJ^ i^ss 

Ifei CAXA<«A^ AOAC^P^ 

S341 CaVGACCAGCT AATTCACATG CATIRTITT6 MWJ^ ^^S^ ^^^<3 

mi =ss 

5701 l^GGGATACT ^0<3«»0 

S761 TCMTTCAGA ATIGGATGCC AACATRGCAG JoS^^ OMCdSaA 

5821 AAATGGAOCC JUSXRflATCCT ^«»CIWU«K: OCT^^ JO^^ ^JJ^ 
S881 CavflCTTOTAA TMWTOCTrr tOCRAIkCRCr OtRflCOOCH TTOXCiaOTT TGCmowK 
5941 CAAAAGGTTT AQGCATTTCC TATCGCAGOA ASAAGOGOAO ACSWJCOACGA AGOOCTCGTC

loo\ aSJJatcaa aatcctctat caaagcaota aotacac^a o««^aa 

6061 TCGTAAGTTT AAGTITArPr AAAGQAQTAG ATTATAGATT AGGAGTAGGA GCATTGATAG 
Vlll SSS??AAT CATAGCAATA ATAGTOTGGA CCATAGCATA TATAGAATAT AGGAAArtGG 
Vlll ?ScAAAA GAAAATAGAC -roGTrAATTA AAAGAATTAG GGAAAGAGCA GAAGACAGTG 
62^1 l^^^^ TGATGGGGAC ACAGAAGAAT TOTCAACAAT GGTGGATATG GGGCATCTTA 
63oi SJ??S^ IGCTAATGAT TIGTAACACO GAGGACTXGT GGOTCACAGT CTACBVTGGG 
636i Sacctgtgt GGAQAGAAGC AAAAACTACT CTATXCTQTO catcaoatgc waaocatat 
6421 SoaSSag xgcataatgt ctgggctaca catgcttotg tacccacaoa ccccaaccca 
All Sttcggaaa tgtaacagaa aattttaata tgtggaaaaa taacatggca 

A W ^SSS? atSggatat aatcagttta tggqatcaaa gcctaaagccatgtgtAaag , 

leoi ?5SSSSc TCTGTOTCAC TTTAAACTGT ACAOATACAA ATGTTACAGG TAATAGAACT 
666^ S?aSSJa aScAAATGA TACCAATATT GCAAATGCTA CATATAAGTA IGAAGAAATG 
I'll CTTXCAATGC AACCACAGAA TTAAGA^TA A^TAA AG^ATGCA 

6781 CTCTTTTATA AACTTGATAT AGTACCACTT AATCM^ ^S^S^ SSSJJS 
6 841 TTAATAAATT GCAATACCTC AACCATAACA CAAGCCTGTC CAAAGGTCTC TTTTGACCCG 

690? IS^^^IJ AT^ciGTGC tcx:agcigat TATGCGATTC taaagtgtaa taataagaca 
r Z] SSatggga caggaccatg ttataatctc agcacagtac aatgtacaca iggaattaag 

7021 SSS^ S^ScT AC^rrAAAT GGTAGTCTAG CAGAAGAAGG GATAATAATT 

nil aS?SSaaa atttgacaga gaaxaccaaa acaataatag ^atcttaa ^J^J 
7141 gagattaatt gtacaaggcc caacaataat acaaggaaaa gtgtaaggat aggaccagoa 

72oi SSxTCT ATOCAACAAA TOAOGTAAJA GOAAACATAA GACAAGCACA TTGTAACATT 
726? Stacagaxa GATGGAATAA AACTTTACAA CAGGXAATGA aaaaattagg agagcatttc 
nil SIS^TT TGAACCACAT GCAGOAOGGG ATCXAOAAAT ^^CJJTGCAT 

738^ SSSaatt GTAGAGGAGA ATTTTTCTAT UGCAATACAT CAAACCTGTT xjmoot^ 
7441 TACTACCCXA AGAATGGXAC ATACAAATAC AATGGTAATT CAAGCTTACC CATCACACTC 
7sJi ^S^S TAAAACAAAT TGTAOGCATO -reGCAAGGGG XAGGACAAGC AATGTATOCC 
716? ^^^T AACATOTAflA TCAAACATCA CAGGAATACT ATX«ACACGT 

762? SSSS tSa^C AAACAACQAC ACAflAOaROA CAIICAOACC TGGAGGAGGA 
768? SSSgGAG AAGIGAATXA TATAAATAIA AAGTGOTROA AATXAAGOOl 

7?!? SSoCAcSa GGCAAAAAGA AGAGTCGTGC AGAGAAAAAA AAflAGCAGTG 

VboI SoTOTTOCT tOGGTTCTXG GGAGCAGCAG GAAGCACTAT GGGOQCAGCG 

^86? ??ScGC TGAOQgSa GGCCRflACAA CTOTWmTO GXATAGTGCA ACAGCAAAGC 
7I2? IScTGA AOGC^flA OOOGCRACAQ CATAWTXOC AACTCACAGT CXGGGGCATT 
Zll J^SS^ AGGOGAGAGT CClXSGCtATA GAAAflATACC SSJJS^ 
a041 GGGATTXGGa GCTOCTCIGG AAGACTCAIC TGCACCACIG CIOTOOCTT8 GRACIC«OT 
J!w ^^ra^ AScTGAAiaC AfiATATTTCG OAIAACAJPQA CrTGGA3X3CA GTGGGATAOA 
SSmS ^SSsA AACAAIKCTC AflOTWCIia AAGACTOGCA AAACCAGCAG 
SJ^^ MaS^ ATXAflAMXO OACAAOWGA ATAATCXGTO CiAATTGGTTT 
nil ACXGGCWTO OTATAJTAAAA ATATICM»A 

8341 GGTTXAAGAA TAAXTTTTGC TSXGCTCTCT AXAGTOAATA aBOTraGOCA OGOftXACI^ 
nil TACCCCAAGC COGAGGGGAC TCaACAflOCT CGOAGG^ 

8461 SaGA^G OXOOAGAGCA AflACAOAGAC AGATCCAIAC flATTGGTGAO COJATTCTXG 
Vsll GgSo^ COGOAGOCia TOCCICTICA GCIACCACOG CTTaAOAGAC 

nil ???SSSa tt^otgag ggcagxooaa cnctGoaAC acascagict caggggacta 

S64? SSgSSt TAAGBlTCro CGAAGICTTO TGCAOIKnO «3OTCI^ 

870? CTAAAAAAGA GTGCTATTAG TOOGCTTGAT ACCATAGCAA TAGCAGTAGC I^M^ 
876? S^G^ TAGAATIGGT ACAAAGAATT -WTAGAGCTA TCCTC^ 
8821 ATAAGACAOG OCTTIGAAGC AGCTTTGCXA TAAAATOGGA GGC^OTOOT CAM^?^ 
8881 CATAOXTOGA TOGOCiaCAG TMOMSMMS AWraROAAOA ACiaaflOCAG <»fi<»fl»f^ 
8941 AGXAGGAGCA GCSXCICAAa ACiraOATnO ACATaGOQai CrCACAAGCA CCaACACRCC 
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«oo, ^OTACTAAT GAAGCTTGTQ CXTTOGCTOCA AGC3VCAAGAG GAGGACGGAG ATQTAGGCTT 
9001 TQCTACTAAT GAAHCTLrtai^ aMOACTTAT AAOAGTGCAG TAGATCTCAG 

9061 TCCAGTCAGA CCTCAflGTAC CTTTJJJ^ ^^SIw ScTCTAGGA AAAGGCAAGA 
9121 CTTCTTTTTA AAAQAAAAOG OGGGACTGGA AOOOTWtfOT AAAACTACAC 
9181 AATCCTTGAT TTX3TGGGTCT ATAACACACA 'i^GCTTCCTC CCTG^^TGGC 
9241 ATCGGGGCCA GGGGTCCGAT TCCCACTGAC CTTTOQATOO ^CTTCAAOC TAOTAOC^T 
9301 TOACCCAAOa OAGGTGAAAG AGGCCAATGA AGGAGAAOAC AACTOTTTOC ^^^^^'^^ 
slel SS^SiSr GOAOCAGAGG ATOAAGATAG AGAAGTATTA ^^^^ ISSS 
s Al TCTAGCACAC AGACACATGG CGOGOGAflCT ACA«X^ J^SS^ iSSS^ 
9481 CAGAAGGGAC TTTCCGCCOG GGACTrT<X:A CTOGGGCGTT ^^^^^ SS^SS 
954X GGGACT.GGG AGTGGTCACC CTC^TGCT ^TAT^ ^S^SSS^S 
9601 GGGTCTCTCT CGGTAGACCA Gft^CTGAGCC ^^^^ SgTGTCTGC CCATCTGTTG 
9661 CTGCTTAGGC ctcaataaag cttgccttga gtgcxct^ JJgtcTGGAA AATCTCTAGC 
9721 TGTGACTCTG GTAACTAGAG ATCCCTCAGA CCCTTTGTU*:. 
i : indicates the regions for β-sheet deletions

is the N-linked glycosylation sites for subtype C TV1 and TV2. Possible mutations (N→Q) or
deletions can be performed
Figure 6

GagComplPolmut.C



GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG 

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG 

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG 

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGCCGAGGCCATGAGCCAGGCCACCAGCGCC 

AACATCCTGATGCAGCGCAGCAACTTCAAGGGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAG 

GAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGC 

CACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGC 

AAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGC 

GGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTG 

TGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCC 

GACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGC 

GGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACC 

GTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTG 

AACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTG 

AAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAG 

GGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGAC 

AGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTG 

CAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGAC 

GCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAAC 

AACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATC 

TTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAG 

GCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAG 

CACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATC 

GAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAAC

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAG 

CTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTG 

GAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGAC 

CTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAG 

AACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAG

GCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATC 

CAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTC 

GTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACC 

TTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGC 

CGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCC 

CTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCC 

CAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTG 
TACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAG

GGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGAC 
CTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACCGGTTCTAGA 
Figure 7
(Sheet 1 of 2)

GagComplPolmutAtt.C

gccaccatgggcgcccgcgccagcatcctgcgcggcggcaagctggacgcctgggagcgc

atccgcctgcgccccggcggcaagaagtgctacatgatgaagcacctggtgtgggccagc 

cgcgagctggagaagttcgccctgaaccccggcctgctggagaccagcgagggctgcaag

cagatcatccgccagctgcaccccgccctgcagaccggcagcgaggagctgaagagcctg 

ttcaacaccgtggccaccctgtactgcgtgcacgagaagatcgaggtccgcgacaccaag 

gaggccctggacaagatcgaggaggagcagaacaagtgccagcagaagatccagcaggc 

CGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGG 

GCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCG 

AGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCA 

CCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGA 

TGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCAC 

GCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACC 

ACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAACCCCCCCATCCCCGTGGGC 

GACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCC 

GTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGAC 

ACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCC

GGCGCCAGCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAA

GGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGA 

GCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACA 

TCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGC 

CACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGC 

CACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCC 

GAGAGCTTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGA 

GACCCTGACCAGCCTGAAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGCCGA 

GGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAGGGCCC 

CAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCG 

CGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACT 

GCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCC 

GCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGC 

GCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCC 

AGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGG 

CCCTGCTGGACTCCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGT 

GGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGA 

TCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCG 

TGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCA 

GCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGC 

AGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAG 

AAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 

ATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAA 

GCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAA 

GAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGA 

GGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCAT

CCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAG 

CAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCA 

GGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGA 

GCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGA

GCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCT

GCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACT

GGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCG

CCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGA 

ACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGG 

TGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCT 
TCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTG

AAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAA 

GACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTA 

CTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCT 

GTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGC 

CGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGA 

AGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTG 

GCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGG 

CATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGC 

AGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGC 

GGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGAC 

GGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGC 

GGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACCGGTTCTAGA 
Figure 8
(Sheet 1 of 2)

GagComplPolmutIna.C

GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGC 

ATCCGCCTGCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGC 

CGCGAGCTGGAGAAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAG 

CAGATCATCCGCCAGCTGCACCCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTG 

TTCAACACCGTGGCCACCCTGTACTGCGTGCACGAGAAGATCGAGGTCCGCGACACCAAG

GAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAGAAGATCCAGCAGGC 

CGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTGCAGG

GCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCG 

AGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCA 

CCCCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGA 

TGCTGAAGGACACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCAC 

GCCGGCCCCATCGCCCCCGGCCAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACC 

ACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAACCCCCCCATCCCCGTGGGC 

GACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATGTACAGCCCC 

GTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGAC

ACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCC 

GGCGCCAGCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAA 

GGCCCGCGTGCTGGCCGAGGCGATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGA 

GCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACA

TCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGC

CACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCCAGC

CACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCC

GAGAGCTTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGA 

GACCCTGACCAGCCTGAAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGCCGA 

GGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAGGGCCC 

CAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCG 

CGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACT 

GCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCC 

GCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGC 

GCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCC 

AGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGG 

CCCTGCTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGT 

GGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGA 

TCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCG 

TGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCA

GCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGC 

AGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAG

AAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 

ATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAA 

GCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAA 

GAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGA 

GGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCAT 

CCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAG 

CAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCA

GGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGA

GCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGA 

GCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCT 

GCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACT 

GGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCG 

CCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGA 

ACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGG 

TGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCT 

TCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTG 




Figure 8 
(Sheet 2 of 2) 

AAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAA

GACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTA

CTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCT 

GTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGC 

CGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGA 

AGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTG 

GCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGG 

CATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGC

AGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGC 

GGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGAC 

GGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGC 

GGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACCGGTTCTAGA
Figure 9
GagComplPolmutInaTatRevNef.C

GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG 

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGCCGAGGCCATGAGCCAGGCCACCAGCGCC 

AACATCCTGATGCAGCGCAGCAACTTCAAGGGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAG

GAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGC 

CACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGC 

AAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGC 

GGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTG 

TGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCC 

GACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGC

GGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACC 

GTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTG 

AACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTG 

AAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAG 

GGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGAC 

AGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTG 

CAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGAC 

GCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAAC 

AACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATC 

TTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAG 

GCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAG 

CACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATC

GAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAAC 

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAG 

CTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTG 

GAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGAC 

CTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAG 

AACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAG 

GCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATC 

CAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTC

GTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACC

TTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGC 

CGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCC 

CTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCC 

CAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTG
TACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAG
GGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGAC 
CTGTACGTGGGCAGCGGCGGCCCTAGGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGC 
AGCCAGCCCAAGACCGCCGGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTC 
CAGACCAAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGC 
AGCGAGGACCACCAGAACCCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGC 
GAGGAGAGCAAGAAGAAGGTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGC 
GGCGACAGCGACGAGGCCCTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTAC 
CCCAAGCCCGAGGGCACCCGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAG 
ATCCACAGCATCAGCGAGCGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAG 
CTGCCCCCCGACCTGCGCCTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGC 
CAGGGCACCACCGAGGGCGTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGC 
TGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAG 
GACCTGGACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTG 
GAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACC 
TACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGC 
AAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAAC 
TACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGAC 
CCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGC 
ATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCC
CGCGAGCTGCACCCCGAGTACTACAAGGACTGCGCCTAA 
GagPolmut.C

GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG 

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTTAAAAAGGGCCCCAAGCGCATCATCAAGTGC 

TTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAG 

TGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTG

GCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGC 

GAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTC 

CCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTG 

CTGGACACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATG 

ATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAG 

AAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAG 

CTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATG

GACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAG

GAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 

ATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAG 

GACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTG 

CTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACC 

ATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAG 

GGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAG 

ATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATC 

GAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCC 

CCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAG 

AGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGC 

ATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACC 

GAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTAC 

GACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTAC 

CAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTG 

AAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAG 

TTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATC 

CCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATC 

ATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTAC

GTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAG 

GCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTG

GGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATC 

AAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGAC 

AAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTGATCTAC 

CAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGC 

ACCGGTTCTAGA 
GagPolmutAtt.C

GTCGACGCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATC

CGCCTGCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAG 

AAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCAC 

CCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTG 

CACGAGAAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGC 

CAGCAGAAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAG 

AACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATC 

GAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAG 

GACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATC 

AACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATG

CGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACC 

AGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG 

CGGATGTACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTG 

GACCGCTTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACC 

CTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTG 

GAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCG 

ATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTTAAAAAGGGCCCCAAGCGCATCATC 

AAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGC 

TGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAG 

GACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACC 

AGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTG

AACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAG 

GCCCTGCTGGACTCCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCC 

AAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGC 

GGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTG

ACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCC 

GGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATC 

TGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTG 

TTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGC 

ACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTG 

ACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCC 

TTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGC 

TGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAAC 

CCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCC 

AAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAG 

GAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAG

AAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTAC 

CCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCC 

CTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTG 

TACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAG 

ATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAAC 

GACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACC 

CCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACC 

TGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAG 

CCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCC 

GGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAG 

CTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTAC 

GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAG 

CTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAG 

ATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTG 

ATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGGG
GCTAGCACCGGTTCTAGA 
GagPolmutIna.C

GTCGACGCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATC 
CGCCTCCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAG 
AAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCAC 
CCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTG 
CACGAGAAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGC 
CAGCAGAAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAG 
AACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATC 
GAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAG 
GACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATC 
AACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATG 
CGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACC 
AGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTG 
CGGATGTACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTG 
GACCGCTTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACC 
CTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTG 
GAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCG 
ATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTTAAAAAGGGCCCCAAGCGCATCATC 
AAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGC 
TGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAG 
GACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACC 
AGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTG 
AACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAG 
GCCCTGCTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCC 
AAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGC 
GGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTG
ACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCC 
GGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATC 
TGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTG 
TTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGC 
ACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTG 
ACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCC 
TTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGC 
TGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAAC 
CCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCC 
AAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAG 
GAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAG
AAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTAC 
CCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCC 
CTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTG 
TACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAG 
ATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAAC 
GACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACC 
CCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACC 
TGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAG 
CCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCC 
GGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAG 
CTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTAC 
GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAG 
CTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAG 
ATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTG 
ATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGGG 
GCTAGCACCGGTTCTAGA 
GagProtInaRTmut.C

GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG 

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG 

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG 

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAGAAAGAATTCCCCCAGATCACCCTGTGGCAGCGCCCC 

CTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTG 

CTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAG 

GTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGC 

CCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATC

AGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCC 

CTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACC 

AAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGG 

CGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATC 

CCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGC 

GTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCC 

GGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGC 

ATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTAC 

GTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGC 

TGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCC 

GACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAG 

CTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTG 

CTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAG 

AACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAG 

ATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACC 

GGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAG 

ATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACC

TGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGAC 

GGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATC 

GTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGC 

GGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAG 

AGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGG 

GTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAG
GTGCTCGCTTAA 
GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTC

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAGAAAGAATTCCCCCAGATCACCCTGTGGCAGCGCCCC 

CTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTG 

CTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAG 

GTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGC 

CCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATC 

AGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCC 

CTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACC 

AAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGG 

CGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATC 

CCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGC

GTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCC 

GGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGC 

ATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTAC 

GTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGC 

TGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCC 

GACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAG 

CTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTG 

CTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAG 

AACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAG 

ATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACC 

GGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAG 

ATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACC 

TGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 

CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGAC 

GGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATC 

GTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGGAGGACAGC

GGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAG 

AGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGG

GTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAG 

GTGCTCaagcttGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACC 

GCCGGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTG

GGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAG 

AACCCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAG 
AAGGTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAG

GCCCTGCTGCAGGCCGTGCGCATCATCAAGATCCTCTACCAGAGCAACCCCTACCCCAAGCCCGAGGGC

ACCCGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGC 

GAGCGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTG 
CGCCTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAG 
GGCGTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGC 
GAGCGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCAC 
GGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAG 
GAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTC 
GACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAG 
ATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCC 
GGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAG 
GAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGAC 
CGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCC 
GAGTACTACAAGGACTGCGCCTAA 
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GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG 
CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 
GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 
CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 
AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 
AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG
CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 
AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 
AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 
GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 
CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 
CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 
TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC
TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 
GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 
ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 
CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 
AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 
GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 
AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 
TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG 
AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAGAAAGAATTCCCCATCAGCCCCATCGAGACCGTGCCC 
GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAG 
GCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCC 
TACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGC
GAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAG 
AAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTC 
CGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAAC 
GTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCC 
TTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATC 
GGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGAC 
AAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCC 
ATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGG 
GCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTG 
ACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAG 
CCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGAC 
CAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGC 
ACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTC
ATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGAC 
TACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTAC
CAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACC 
AAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACC 
AACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTG 
ACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAAC 
CAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATC 
GGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTCTAA 
GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTCTCGGCCAGCCGCGAGCTGGAGAAGTTC

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTACTGCGTGCACGAG 

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG 

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG 

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC 

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG 

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAGAAAGAATTCCCCATCAGCCCCATCGAGACCGTGCCC 

GTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAG 

GCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCC 

TACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGC 

GAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAG 

AAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTC 

CGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAAC 

GTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCC 

TTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATC 

GGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGAC 

AAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCC 

ATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGG 

GCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTG 

ACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAG 

CCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGAC

CAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGC 

ACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTG 

ATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGAC 

TACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTAC 

CAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACC 

AAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACC 

AACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTG 

ACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAAC 

CAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATC 

GGCGGCAACGAGCAGATCGACAAGCTCGTGAGCAAGGGCATCCGCAAGGTGCTCAAGCTTGAGCCCGTG

GACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAACAAGTGCTACTGC 

AAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGCTACGGCCGCAAG 

AAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATCAGCAAGCAGCCC 

CTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAGAGCAAGACCGAG 

ACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTGCAGGCCGTGCGC 

ATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAGGCCGACCTCAAC

CGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATCCTGAGCACCTGC 

CTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCACATCGACTGCAGC 
GAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGCAGCCCCCTCGAG
GCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCGAG 
CCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGCGCCCTGACCAGCAGCAAC 
ACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCC 
GTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAG 
GAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTAC 
CACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACC 
TTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAAC 
AACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAG 
TTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCGCC
TAA 
GCCACCATGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAGCGCATCCGCCTG

CGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAGCTGGAGAAGTTC 

GCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAGCTGCACCCCGCC 

CTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTCTACTGCGTGCACGAG

AAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAACAAGTGCCAGCAG

AAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATCGTGCAGAACCTG

CAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAGGTGATCGAGGAG 

AAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACCCCCCAGGACCTG 

AACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGACACCATCAACGAG 

GAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGCCAGATGCGCGAG 

CCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGGATGACCAGCAAC 

CCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAGATCGTGCGGATG 

TACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGACTACGTGGACCGC 

TTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACCGACACCCTGCTG 

GTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCCAGCCTGGAGGAG 

ATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCCGAGGCGATGAGC

CAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATCGTCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGCAAGATCTGGCCC 

AGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCCCCCGCCGAGAGC 

TTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACCCTGACCAGCCTG

AAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGAGCCCGTGGACCCCAACCTGGAGCCCTGG 

AACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGC 

CTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGC 

GCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGAC 

CCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGG 

GCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAG 

AGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCC 

CGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCC 

GTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGC 

ACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGC 

AGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGC 

GCCGCCAGCCAGGACCTGGACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGAC 

TGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTG 

CGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGC 

CTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCC 

GGCTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTG 

GTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATG 

AGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGC

CGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCGCCTAA
Figure 18
(Sheet 1 of 1) 



gp120mod.TV1.del118-210



1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc 

361 ggcgcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 

421 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 

481 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 

541 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 

601 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 

661 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 

721 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 

781 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 

841 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 

901 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 

961 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 

1021 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 

1081 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 

1141 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 

1201 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg 
1261 gtgcagcgcg agaagcgcta a
(Sheet 1 of 1)

gp120mod.TV1.delV1V2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 

361 acccccctgt gcgtgggcgc cggcaactgc aacaccagca ccatcaccca ggcctgcccc 

421 aaggtgagct tcgaccccat ccccatccac tactgcgccc ccgccggcta cgccatcctg 

481 aagtgcaaca acaagacctt caacggcacc ggcccctgct acaacgtgag caccgtgcag 

541 tgcacccacg gcatcaagcc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 

601 gaggagggca tcatcatccg cagcgagaac ctgaccgaga acaccaagac catcatcgtg 

661 cacctgaacg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 

721 gtgcgcatcg gccccggcca ggccttctac gccaccaacg acgtgatcgg caacatccgc 

781 caggcccact gcaacatcag caccgaccgc tggaacaaga ccctgcagca ggtgatgaag 

841 aagctgggcg agcacttccc caacaagacc atccagttca agccccacgc cggcggcgac 

901 ctggagatca ccatgcacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 

961 aacctgttca acagcaccta ccacagcaac aacggcacct acaagtacaa cggcaacagc

1021 agcagcccca tcaccctgca gtgcaagatc aagcagatcg tgcgcatgtg gcagggcgtg 

1081 ggccaggcca cctacgcccc ccccatcgcc ggcaacatca cctgccgcag caacatcacc

1141 ggcatcctgc tgacccgcga cggcggcttc aacaccacca acaacaccga gaccttccgc 

1201 cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg

1261 gagatcaagc ccctgggcat cgcccccacc aaggccaagc gccgcgtggt gcagcgcgag 

1321 aagcgctaa 
Figure 20
(Sheet 1 of 1)

gp120mod.TV1.delV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc 
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg 
1441 gtgcagcgcg agaagcgcta a 
Figure 21
(Sheet 1 of 1)

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc
361 ggcgcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc
421 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg
481 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac
541 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag
601 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac
661 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc
721 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag
781 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac
841 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac
901 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac
961 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg
1021 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc
1081 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc
1141 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag
1201 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg
1261 gtgcagcgcg agaagcgcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc
1321 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg
1381 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1441 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc
1501 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1561 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac
1621 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg
1681 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag
1741 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa
Figure 22
(Sheet 1 of 1)

gp140mod.TV1.delV1V2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 

361 acccccctgt gcgtgggcgc cggcaactgc aacaccagca ccatcaccca ggcctgcccc 

421 aaggtgagct tcgaccccat ccccatccac tactgcgccc ccgccggcta cgccatcctg 

481 aagtgcaaca acaagacctt caacggcacc ggcccctgct acaacgtgag caccgtgcag 

541 tgcacccacg gcatcaagcc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 

601 gaggagggca tcatcatccg cagcgagaac ctgaccgaga acaccaagac catcatcgtg 

661 cacctgaacg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc 

721 gtgcgcatcg gccccggcca ggccttctac gccaccaacg acgtgatcgg caacatccgc 

781 caggcccact gcaacatcag caccgaccgc tggaacaaga ccctgcagca ggtgatgaag 

841 aagctgggcg agcacttccc caacaagacc atccagttca agccccacgc cggcggcgac 

901 ctggagatca ccatgcacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 

961 aacctgttca acagcaccta ccacagcaac aacggcacct acaagtacaa cggcaacagc 

1021 agcagcccca tcaccctgca gtgcaagatc aagcagatcg tgcgcatgtg gcagggcgtg 

1081 ggccaggcca cctacgcccc ccccatcgcc ggcaacatca cctgccgcag caacatcacc 

1141 ggcatcctgc tgacccgcga cggcggcttc aacaccacca acaacaccga gaccttccgc 

1201 cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 

1261 gagatcaagc ccctgggcat cgcccccacc aaggccaagc gccgcgtggt gcagcgcgag 

1321 aagcgcgccg tgggcatcgg cgccgtgttc ctgggcttcc tgggcgccgc cggcagcacc 

1381 atgggcgccg ccagcatcac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg

1441 cagcagcaga gcaacctgct gaaggccatc gaggcccagc agcacatgct gcagctgacc 

1501 gtgtggggca tcaagcagct gcaggcccgc gtgctggcca tcgagcgcta cctgaaggac 

1561 cagcagctgc tgggcatctg gggctgcagc ggccgcctga tctgcaccac cgccgtgccc 

1621 tggaacagca gctggagcaa caagagcgag aaggacatct gggacaacat gacctggatg 

1681 cagtgggacc gcgagatcag caactacacc ggcctgatct acaacctgct ggaggacagc 

1741 cagaaccagc aggagaagaa cgagaaggac ctgctggagc tggacaagtg gaacaacctg 

1801 tggaactggt tcgacatcag caactggccc tggtacatct aa 
gp140mod.TV1.delV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg
1441 gtgcagcgcg agaagcgcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg 
541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg acaacttcac ctaccgcctg 
601 atcaactgca acaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 
661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 
721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc 
841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 
901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 
961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 
1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 
1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac 
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccatcag cagcgtggtg cagagcgaga agagcgccgt gggcatcggc 
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcta a 
1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt
61 tcgcccagca acaccgagga cctgtgggtg accgtgtact acggcgtgcc cgtgtggcgc 
121 gacgccaaga ccaccctgtt ctgcgccagc gacgccaagg cctacgagac cgaggtgcac 
181 aacgtgtggg ccacccacgc ctgcgtgccc accgacccca acccccagga gatcgtgctg 
241 ggcaacgtga ccgagaactt caacatgtgg aagaacgaca tggccgacca gatgcacgag 
301 gacgtgatca gcctgtggga ccagagcctg aagccctgcg tgaagctgac ccccctgtgc 
361 gtgaccctga actgcaccga caccaacgtg accggcaacc gcaccgtgac cggcaacagc 
421 accaacaaca ccaacggcac cggcatctac aacatcgagg agatgaagaa ctgcagcttc 
481 aacgccacca ccgagctgcg cgacaagaag cacaaggagt acgccctgtt ctaccgcctg 
541 gacatcgtgc ccctgaacga gaacagcgac aacttcacct accgcctgat caactgcaac 
601 accagcacca tcacccaggc ctgccccaag gtgagcttcg accccatccc catccactac 
661 tgcgcccccg ccggctacgc catcctgaag tgcaacaaca agaccttcaa cggcaccggc 
721 ccctgctaca acgtgagcac cgtgcagtgc acccacggca tcaagcccgt ggtgagcacc 
781 cagctgctgc tgaacggcag cctggccgag gagggcatca tcatccgcag cgagaacctg
841 accgagaaca ccaagaccat catcgtgcac ctgaacgaga gcgtggagat caactgcacc 
901 cgccccaaca acaacacccg caagagcgtg cgcatcggcc ccggccaggc cttctacgcc 
961 accaacgacg tgatcggcaa catccgccag gcccactgca acatcagcac cgaccgctgg
1021 aacaagaccc tgcagcaggt gatgaagaag ctgggcgagc acttccccaa caagaccatc 
1081 cagttcaagc cccacgccgg cggcgacctg gagatcacca tgcacagctt caactgccgc 
1141 ggcgagttct tctactgcaa caccagcaac ctgttcaaca gcacctacca cagcaacaac 
1201 ggcacctaca agtacaacgg caacagcagc agccccatca ccctgcagtg caagatcaag 
1261 cagatcgtgc gcatgtggca gggcgtgggc caggccacct acgccccccc catcgccggc 
1321 aacatcacct gccgcagcaa catcaccggc atcctgctga cccgcgacgg cggcttcaac 
1381 accaccaaca acaccgagac cttccgcccc ggcggcggcg acatgcgcga caactggcgc 
1441 agcgagctgt acaagtacaa ggtggtggag atcaagcccc tgggcatcgc ccccaccaag 
1501 gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtgg gcatcggcgc cgtgttcctg
1561 ggcttcctgg gcgccgccgg cagcaccatg ggcgccgcca gcatcaccct gaccgtgcag
1621 gcccgccagc tgctgagcgg catcgtgcag cagcagagca acctgctgaa ggccatcgag
1681 gcccagcagc acatgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg
1741 ctggccatcg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 
1801 cgcctgatct gcaccaccgc cgtgccctgg aacagcagct ggagcaacaa gagcgagaag 
1861 gacatctggg acaacatgac ctggatgcag tgggaccgcg agatcagcaa ctacaccggc 
1921 ctgatctaca acctgctgga ggacagccag aaccagcagg agaagaacga gaaggacctg 
1981 ctggagctgg acaagtggaa caacctgtgg aactggttcg acatcagcaa ctggccctgg 
2041 tacatctaa 
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 

361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 

421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag

481 aactgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg 

541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg acaacttcac ctaccgcctg 

601 atcaactgca acaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 

661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 

721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 

781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc 

841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 

901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 

961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 

1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 

1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 

1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac 

1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag 

1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc 

1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac

1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc

1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 

1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc 

1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 

1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 

1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 

1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 

1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 

1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc 

1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 

1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 

2041 aactggccct ggtacatcaa gatcttcatc atgatcgtgg gcggcctgat cggcctgcgc 

2101 atcatcttcg ccgtgctgag catcgtg 
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgggcgcc
361 ggcgcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
421 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
481 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
541 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
601 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 
661 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
721 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
781 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 
841 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
901 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 
961 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1021 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1081 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1141 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1201 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg
1261 gtgcagcgcg agaagcgcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc
1321 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg
1381 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1441 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1501 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc 
1561 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac 
1621 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1681 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag 
1741 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat caagatcttc 
1801 atcatgatcg tgggcggcct gatcggcctg cgcatcatct tcgccgtgct gagcatcgtg 
1861 aaccgcgtgc gccagggcta cagccccctg agcttccaga ccctgacccc cagcccccgc 
1921 ggcctggacc gcctgggcgg catcgaggag gagggcggcg agcaggaccg cgaccgcagc 
1981 atccgcctgg tgagcggctt cctgagcctg gcctgggacg acctgcgcaa cctgtgcctg 
2041 ttcagctacc accgcctgcg cgacttcatc ctgatcgccg tgcgcgccgt ggagctgctg 
2101 ggccacagca gcctgcgcgg cctgcagcgc ggctgggaga tcctgaagta cctgggcagc 
2161 ctggtgcagt actggggcct ggagctgaag aagagcgcca tcagcctgct ggacaccatc 
2221 gccatcaccg tggccgaggg caccgaccgc atcatcgagc tggtgcagcg catctgccgc 
2281 gccatcctga acatcccccg ccgcatccgc cagggcttcg aggccgccct gctgtaa 
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg
361 acccccctgt gcgtgggcgc cggcaactgc aacaccagca ccatcaccca ggcctgcccc
421 aaggtgagct tcgaccccat ccccatccac tactgcgccc ccgccggcta cgccatcctg
481 aagtgcaaca acaagacctt caacggcacc ggcccctgct acaacgtgag caccgtgcag
541 tgcacccacg gcatcaagcc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc
601 gaggagggca tcatcatccg cagcgagaac ctgaccgaga acaccaagac catcatcgtg
661 cacctgaacg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc
721 gtgcgcatcg gccccggcca ggccttctac gccaccaacg acgtgatcgg caacatccgc
781 caggcccact gcaacatcag caccgaccgc tggaacaaga ccctgcagca ggtgatgaag
841 aagctgggcg agcacttccc caacaagacc atccagttca agccccacgc cggcggcgac
901 ctggagatca ccatgcacag cttcaactgc cgcggcgagt tcttctactg caacaccagc
961 aacctgttca acagcaccta ccacagcaac aacggcacct acaagtacaa cggcaacagc
1021 agcagcccca tcaccctgca gtgcaagatc aagcagatcg tgcgcatgtg gcagggcgtg
1081 ggccaggcca cctacgcccc ccccatcgcc ggcaacatca cctgccgcag caacatcacc
1141 ggcatcctgc tgacccgcga cggcggcttc aacaccacca acaacaccga gaccttccgc
1201 cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg
1261 gagatcaagc ccctgggcat cgcccccacc aaggccaagc gccgcgtggt gcagcgcgag
1321 aagcgcgccg tgggcatcgg cgccgtgttc ctgggcttcc tgggcgccgc cggcagcacc
1381 atgggcgccg ccagcatcac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg
1441 cagcagcaga gcaacctgct gaaggccatc gaggcccagc agcacatgct gcagctgacc
1501 gtgtggggca tcaagcagct gcaggcccgc gtgctggcca tcgagcgcta cctgaaggac
1561 cagcagctgc tgggcatctg gggctgcagc ggccgcctga tctgcaccac cgccgtgccc
1621 tggaacagca gctggagcaa caagagcgag aaggacatct gggacaacat gacctggatg
1681 cagtgggacc gcgagatcag caactacacc ggcctgatct acaacctgct ggaggacagc
1741 cagaaccagc aggagaagaa cgagaaggac ctgctggagc tggacaagtg gaacaacctg
1801 tggaactggt tcgacatcag caactggccc tggtacatct aa
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacaac
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg
1441 gtgcagcgcg agaagcgcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat caagatcttc
1981 atcatgatcg tgggcggcct gatcggcctg cgcatcatct tcgccgtgct gagcatcgtg
2041 aaccgcgtgc gccagggcta cagccccctg agcttccaga ccctgacccc cagcccccgc
2101 ggcctggacc gcctgggcgg catcgaggag gagggcggcg agcaggaccg cgaccgcagc
2161 atccgcctgg tgagcggctt cctgagcctg gcctgggacg acctgcgcaa cctgtgcctg
2221 ttcagctacc accgcctgcg cgacttcatc ctgatcgccg tgcgcgccgt ggagctgctg
2281 ggccacagca gcctgcgcgg cctgcagcgc ggctgggaga tcctgaagta cctgggcagc
2341 ctggtgcagt actggggcct ggagctgaag aagagcgcca tcagcctgct ggacaccatc
2401 gccatcaccg tggccgaggg caccgaccgc atcatcgagc tggtgcagcg catctgccgc
2461 gccatcctga acatcccccg ccgcatccgc cagggcttcg aggccgccct gctgtaa
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgggcgc cggcaactgc agcttcaacg ccaccaccga gctgcgcgac 
421 aagaagcaca aggagtacgc cctgttctac cgcctggaca tcgtgcccct gaacgagaac 
481 agcgacaact tcacctaccg cctgatcaac tgcaacacca gcaccatcac ccaggcctgc 
541 cccaaggtga gcttcgaccc catccccatc cactactgcg cccccgccgg ctacgccatc 
601 ctgaagtgca acaacaagac cttcaacggc accggcccct gctacaacgt gagcaccgtg 
661 cagtgcaccc acggcatcaa gcccgtggtg agcacccagc tgctgctgaa cggcagcctg 
721 gccgaggagg gcatcatcat ccgcagcgag aacctgaccg agaacaccaa gaccatcatc 
781 gtgcacctga acgagagcgt ggagatcaac tgcacccgcc ccaacaacaa cacccgcaag 
841 agcgtgcgca tcggccccgg ccaggccttc tacgccacca acgacgtgat cggcaacatc 
901 cgccaggccc actgcaacat cagcaccgac cgctggaaca agaccctgca gcaggtgatg 
961 aagaagctgg gcgagcactt ccccaacaag accatccagt tcaagcccca cgccggcggc 
1021 gacctggaga tcaccatgca cagcttcaac tgccgcggcg agttcttcta ctgcaacacc 
1081 agcaacctgt tcaacagcac ctaccacagc aacaacggca cctacaagta caacggcaac 
1141 agcagcagcc ccatcaccct gcagtgcaag atcaagcaga tcgtgcgcat gtggcagggc
1201 gtgggccagg ccacctacgc cccccccatc gccggcaaca tcacctgccg cagcaacatc 
1261 accggcatcc tgctgacccg cgacggcggc ttcaacacca ccaacaacac cgagaccttc 
1321 cgccccggcg gcggcgacat gcgcgacaac tggcgcagcg agctgtacaa gtacaaggtg 
1381 gtggagatca agcccctggg catcgccccc accaaggcca agcgccgcgt ggtgcagcgc 
1441 gagaagcgcg ccgtgggcat cggcgccgtg ttcctgggct tcctgggcgc cgccggcagc 
1501 accatgggcg ccgccagcat caccctgacc gtgcaggccc gccagctgct gagcggcatc 
1561 gtgcagcagc agagcaacct gctgaaggcc atcgaggccc agcagcacat gctgcagctg 
1621 accgtgtggg gcatcaagca gctgcaggcc cgcgtgctgg ccatcgagcg ctacctgaag 
1681 gaccagcagc tgctgggcat ctggggctgc agcggccgcc tgatctgcac caccgccgtg 
1741 ccctggaaca gcagctggag caacaagagc gagaaggaca tctgggacaa catgacctgg 
1801 atgcagtggg accgcgagat cagcaactac accggcctga tctacaacct gctggaggac 
1861 agccagaacc agcaggagaa gaacgagaag gacctgctgg agctggacaa gtggaacaac 
1921 ctgtggaact ggttcgacat cagcaactgg ccctggtaca tcaagatctt catcatgatc 
1981 gtgggcggcc tgatcggcct gcgcatcatc ttcgccgtgc tgagcatcgt gaaccgcgtg 
2041 cgccagggct acagccccct gagcttccag accctgaccc ccagcccccg cggcctggac 
2101 cgcctgggcg gcatcgagga ggagggcggc gagcaggacc gcgaccgcag catccgcctg 
2161 gtgagcggct tcctgagcct ggcctgggac gacctgcgca acctgtgcct gttcagctac 
2221 caccgcctgc gcgacttcat cctgatcgcc gtgcgcgccg tggagctgct gggccacagc 
2281 agcctgcgcg gcctgcagcg cggctgggag atcctgaagt acctgggcag cctggtgcag 
2341 tactggggcc tggagctgaa gaagagcgcc atcagcctgc tggacaccat cgccatcacc 
2401 gtggccgagg gcaccgaccg catcatcgag ctggtgcagc gcatctgccg cgccatcctg 
2461 aacatccccc gccgcatccg ccagggcttc gaggccgccc tgctgtaa 
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg
361 acccccctgt gcgtgggcgc cggcaactgc agcttcaacg ccaccaccga gctgcgcgac 
421 aagaagcaca aggagtacgc cctgttctac cgcctggaca tcgtgcccct gaacgagaac 
481 agcgacaact tcacctaccg cctgatcaac tgcaacacca gcaccatcac ccaggcctgc
541 cccaaggtga gcttcgaccc catccccatc cactactgcg cccccgccgg ctacgccatc 
601 ctgaagtgca acaacaagac cttcaacggc accggcccct gctacaacgt gagcaccgtg 
661 cagtgcaccc acggcatcaa gcccgtggtg agcacccagc tgctgctgaa cggcagcctg 
721 gccgaggagg gcatcatcat ccgcagcgag aacctgaccg agaacaccaa gaccatcatc 
781 gtgcacctga acgagagcgt ggagatcaac tgcacccgcc ccaacaacaa cacccgcaag 
841 agcgtgcgca tcggccccgg ccaggccttc tacgccacca acgacgtgat cggcaacatc 
901 cgccaggccc actgcaacat cagcaccgac cgctggaaca agaccctgca gcaggtgatg 
961 aagaagctgg gcgagcactt ccccaacaag accatccagt tcaagcccca cgccggcggc 
1021 gacctggaga tcaccatgca cagcttcaac tgccgcggcg agttcttcta ctgcaacacc 
1081 agcaacctgt tcaacagcac ctaccacagc aacaacggca cctacaagta caacggcaac 
1141 agcagcagcc ccatcaccct gcagtgcaag atcaagcaga tcgtgcgcat gtggcagggc 
1201 gtgggccagg ccacctacgc cccccccatc gccggcaaca tcacctgccg cagcaacatc 
1261 accggcatcc tgctgacccg cgacggcggc ttcaacacca ccaacaacac cgagaccttc 
1321 cgccccggcg gcggcgacat gcgcgacaac tggcgcagcg agctgtacaa gtacaaggtg 
1381 gtggagatca agcccctggg catcgccccc accaaggcca agcgccgcgt ggtgcagcgc
1441 gagaagcgcg ccgtgggcat cggcgccgtg ttcctgggct tcctgggcgc cgccggcagc 
1501 accatgggcg ccgccagcat caccctgacc gtgcaggccc gccagctgct gagcggcatc 
1561 gtgcagcagc agagcaacct gctgaaggcc atcgaggccc agcagcacat gctgcagctg 
1621 accgtgtggg gcatcaagca gctgcaggcc cgcgtgctgg ccatcgagcg ctacctgaag
1681 gaccagcagc tgctgggcat ctggggctgc agcggccgcc tgatctgcac caccgccgtg 
1741 ccctggaaca gcagctggag caacaagagc gagaaggaca tctgggacaa catgacctgg 
1801 atgcagtggg accgcgagat cagcaactac accggcctga tctacaacct gctggaggac
1861 agccagaacc agcaggagaa gaacgagaag gacctgctgg agctggacaa gtggaacaac 
1921 ctgtggaact ggttcgacat cagcaactgg ccctggtaca tcaagatctt catcatgatc 
1981 gtgggcggcc tgatcggcct gcgcatcatc ttcgccgtgc tgagcatcgt gaaccgcgtg 
2041 cgccagggct acagccccct gagcttccag accctgaccc ccagcccccg cggcctggac 
2101 cgcctgggcg gcatcgagga ggagggcggc gagcaggacc gcgaccgcag catccgcctg 
2161 gtgagcggct tcctgagcct ggcctgggac gacctgcgca acctgtgcct gttcagctac 
2221 caccgcctgc gcgacttcat cctgatcgcc gtgcgcgccg tggagctgct gggccacagc
2281 agcctgcgcg gcctgcagcg cggctgggag atcctgaagt acctgggcag cctggtgcag
2341 tactggggcc tggagctgaa gaagagcgcc atcagcctgc tggacaccat cgccatcacc
2401 gtggccgagg gcaccgaccg catcatcgag ctggtgcagc gcatctgccg cgccatcctg 
2461 aacatccccc gccgcatccg ccagggcttc gaggccgccc tgctgtaact cgagcaagtc 
2521 tagagggaga ccacaacggt ttccctctag cgggatcaat tccgcccccc cccctaacgt 
2581 tactggccga agccgcttgg aataaggccg gtgtgcgttt gtctatatgt tattttccac 
2641 catattgccg tcttttggca atgtgagggc ccggaaacct ggccctgtct tcttgacgag 
2701 cattcctagg ggtctttccc ctctcgccaa aggaatgcaa ggtctgttga atgtcgtgaa 
2761 ggaagcagtt cctctggaag cttcttgaag acaaacaacg tctgtagcga ccctttgcag 
2821 gcagcggaac cccccacctg gcgacaggtg cctctgcggc caaaagccac gtgtataaga 
2881 tacacctgca aaggcggcac aaccccagtg ccacgttgtg agttggatag ttgtggaaag 
2941 agtcaaatgg ctctcctcaa gcgtattcaa caaggggctg aaggatgccc agaaggtacc 
3001 ccattgtatg ggatctgatc tggggcctcg gtgcacatgc tttacatgtg tttagtcgag 
3061 gttaaaaaac gtctaggccc cccgaaccac ggggacgtgg ttttcctttg aaaaacacga 
3181 catccgcctg cgccccggcg gcaagaagtg ctacatgatg aagcacctgg tgtgggccag
3241 ccgcgagctg gagaagttcg ccctgaaccc cggcctgctg gagaccagcg agggctgcaa 
3301 gcagatcatc cgccagctgc accccgccct gcagaccggc agcgaggagc tgaagagcct 
3361 gttcaacacc gtggccaccc tgtactgcgt gcacgagaag atcgaggtcc gcgacaccaa
3421 ggaggccctg gacaagatcg aggaggagca gaacaagtgc cagcagaaga tccagcaggc
3481 cgaggccgcc gacaagggca aggtgagcca gaactacccc atcgtgcaga acctgcaggg
3541 ccagatggtg caccaggcca tcagcccccg caccctgaac gcctgggtga aggtgatcga
3601 ggagaaggcc ttcagccccg aggtgatccc catgttcacc gccctgagcg agggcgccac 
3661 cccccaggac ctgaacacga tgttgaacac cgtgggcggc caccaggccg ccatgcagat 
3721 gctgaaggac accatcaacg aggaggccgc cgagtgggac cgcgtgcacc ccgtgcacgc 
3781 cggccccatc gcccccggcc agatgcgcga gccccgcggc agcgacatcg ccggcaccac 
3841 cagcaccctg caggagcaga tcgcctggat gaccagcaac ccccccatcc ccgtgggcga 
3901 catctacaag cggtggatca tcctgggcct gaacaagatc gtgcggatgt acagccccgt 
3961 gagcatcctg gacatcaagc agggccccaa ggagcccttc cgcgactacg tggaccgctt 
4021 cttcaagacc ctgcgcgccg agcagagcac ccaggaggtg aagaactgga tgaccgacac 
4081 cctgctggtg cagaacgcca accccgactg caagaccatc ctgcgcgctc tcggccccgg 
4141 cgccagcctg gaggagatga tgaccgcctg ccagggcgtg ggcggcccca gccacaaggc 
4201 ccgcgtgctg gccgaggcga tgagccaggc caacaccagc gtgatgatgc agaagagcaa 
4261 cttcaagggc ccccggcgca tcgtcaagtg cttcaactgc ggcaaggagg gccacatcgc 
4321 ccgcaactgc cgcgcccccc gcaagaaggg ctgctggaag tgcggcaagg agggccacca 
4381 gatgaaggac tgcaccgagc gccaggccaa cttcctgggc aagatctggc ccagccacaa
4441 gggccgcccc ggcaacttcc tgcagagccg ccccgagccc accgcccccc ccgccgagag 
4501 cttccgcttc gaggagacca cccccggcca gaagcaggag agcaaggacc gcgagaccct 
4561 gaccagcctg aagagcctgt tcggcaacga ccccctgagc caataa 
1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgggcgc cggcaactgc aacaccagca ccatcaccca ggcctgcccc 
421 aaggtgagct tcgaccccat ccccatccac tactgcgccc ccgccggcta cgccatcctg 
481 aagtgcaaca acaagacctt caacggcacc ggcccctgct acaacgtgag caccgtgcag 
541 tgcacccacg gcatcaagcc cgtggtgagc acccagctgc tgctgaacgg cagcctggcc 
601 gaggagggca tcatcatccg cagcgagaac ctgaccgaga acaccaagac catcatcgtg 
661 cacctgaacg agagcgtgga gatcaactgc acccgcccca acaacaacac ccgcaagagc
721 gtgcgcatcg gccccggcca ggccttctac gccaccaacg acgtgatcgg caacatccgc
781 caggcccact gcaacatcag caccgaccgc tggaacaaga ccctgcagca ggtgatgaag 
841 aagctgggcg agcacttccc caacaagacc atccagttca agccccacgc cggcggcgac 
901 ctggagatca ccatgcacag cttcaactgc cgcggcgagt tcttctactg caacaccagc 
961 aacctgttca acagcaccta ccacagcaac aacggcacct acaagtacaa cggcaacagc
1021 agcagcccca tcaccctgca gtgcaagatc aagcagatcg tgcgcatgtg gcagggcgtg
1081 ggccaggcca cctacgcccc ccccatcgcc ggcaacatca cctgccgcag caacatcacc 
1141 ggcatcctgc tgacccgcga cggcggcttc aacaccacca acaacaccga gaccttccgc 
1201 cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 
1261 gagatcaagc ccctgggcat cgcccccacc aaggccaagc gccgcgtggt gcagcgcgag 
1321 aagcgcgccg tgggcatcgg cgccgtgttc ctgggcttcc tgggcgccgc cggcagcacc 
1381 atgggcgccg ccagcatcac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 
1441 cagcagcaga gcaacctgct gaaggccatc gaggcccagc agcacatgct gcagctgacc 
1501 gtgtggggca tcaagcagct gcaggcccgc gtgctggcca tcgagcgcta cctgaaggac
1561 cagcagctgc tgggcatctg gggctgcagc ggccgcctga tctgcaccac cgccgtgccc
1621 tggaacagca gctggagcaa caagagcgag aaggacatct gggacaacat gacctggatg 
1681 cagtgggacc gcgagatcag caactacacc ggcctgatct acaacctgct ggaggacagc 
1741 cagaaccagc aggagaagaa cgagaaggac ctgctggagc tggacaagtg gaacaacctg
1801 tggaactggt tcgacatcag caactggccc tggtacatca agatcttcat catgatcgtg 
1861 ggcggcctga tcggcctgcg catcatcttc gccgtgctga gcatcgtgaa ccgcgtgcgc 
1921 cagggctaca gccccctgag cttccagacc ctgaccccca gcccccgcgg cctggaccgc 
1981 ctgggcggca tcgaggagga gggcggcgag caggaccgcg accgcagcat ccgcctggtg 
2041 agcggcttcc tgagcctggc ctgggacgac ctgcgcaacc tgtgcctgtt cagctaccac 
2101 cgcctgcgcg acttcatcct gatcgccgtg cgcgccgtgg agctgctggg ccacagcagc 
2161 ctgcgcggcc tgcagcgcgg ctgggagatc ctgaagtacc tgggcagcct ggtgcagtac 
2221 tggggcctgg agctgaagaa gagcgccatc agcctgctgg acaccatcgc catcaccgtg 
2281 gccgagggca ccgaccgcat catcgagctg gtgcagcgca tctgccgcgc catcctgaac 
2341 atcccccgcc gcatccgcca gggcttcgag gccgccctgc tgtaactcga gcaagtctag 
2401 agggagacca caacggtttc cctctagcgg gatcaattcc gccccccccc ctaacgttac 
2461 tggccgaagc cgcttggaat aaggccggtg tgcgtttgtc tatatgttat tttccaccat 
2521 attgccgtct tttggcaatg tgagggcccg gaaacctggc cctgtcttct tgacgagcat 
2581 tcctaggggt ctttcccctc tcgccaaagg aatgcaaggt ctgttgaatg tcgtgaagga
2641 agcagttcct ctggaagctt cttgaagaca aacaacgtct gtagcgaccc tttgcaggca
2701 gcggaacccc ccacctggcg acaggtgcct ctgcggccaa aagccacgtg tataagatac 
2761 acctgcaaag gcggcacaac cccagtgcca cgttgtgagt tggatagttg tggaaagagt 
2821 caaatggctc tcctcaagcg tattcaacaa ggggctgaag gatgcccaga aggtacccca
2881 ttgtatggga tctgatctgg ggcctcggtg cacatgcttt acatgtgttt agtcgaggtt 
2941 aaaaaacgtc taggcccccc gaaccacggg gacgtggttt tcctttgaaa aacacgataa 
3001 taccatgggc gcccgcgcca gcatcctgcg cggcggcaag ctggacgcct gggagcgcat 
3061 ccgcctgcgc cccggcggca agaagtgcta catgatgaag cacctggtgt gggccagccg 
3181 gatcatccgc cagctgcacc ccgccctgca gaccggcagc gaggagctga agagcctgtt
3241 caacaccgtg gccaccctgt actgcgtgca cgagaagatc gaggtccgcg acaccaagga 
3301 ggccctggac aagatcgagg aggagcagaa caagtgccag cagaagatcc agcaggccga 
3361 ggccgccgac aagggcaagg tgagccagaa ctaccccatc gtgcagaacc tgcagggcca
3421 gatggtgcac caggccatca gcccccgcac cctgaacgcc tgggtgaagg tgatcgagga 
3481 gaaggccttc agccccgagg tgatccccat gttcaccgcc ctgagcgagg gcgccacccc
3541 ccaggacctg aacacgatgt tgaacaccgt gggcggccac caggccgcca tgcagatgct 
3601 gaaggacacc atcaacgagg aggccgccga gtgggaccgc gtgcaccccg tgcacgccgg 
3661 ccccatcgcc cccggccaga tgcgcgagcc ccgcggcagc gacatcgccg gcaccaccag 
3721 caccctgcag gagcagatcg cctggatgac cagcaacccc cccatccccg tgggcgacat
3781 ctacaagcgg tggatcatcc tgggcctgaa caagatcgtg cggatgtaca gccccgtgag 
3841 catcctggac atcaagcagg gccccaagga gcccttccgc gactacgtgg accgcttctt
3901 caagaccctg cgcgccgagc agagcaccca ggaggtgaag aactggatga ccgacaccct
3961 gctggtgcag aacgccaacc ccgactgcaa gaccatcctg cgcgctctcg gccccggcgc 
4021 cagcctggag gagatgatga ccgcctgcca gggcgtgggc ggccccagcc acaaggcccg 
4081 cgtgctggcc gaggcgatga gccaggccaa caccagcgtg atgatgcaga agagcaactt 
4141 caagggcccc cggcgcatcg tcaagtgctt caactgcggc aaggagggcc acatcgcccg 
4201 caactgccgc gccccccgca agaagggctg ctggaagtgc ggcaaggagg gccaccagat 
4261 gaaggactgc accgagcgcc aggccaactt cctgggcaag atctggccca gccacaaggg 
4321 ccgccccggc aacttcctgc agagccgccc cgagcccacc gccccccccg ccgagagctt 
4381 ccgcttcgag gagaccaccc ccggccagaa gcaggagagc aaggaccgcg agaccctgac
4441 cagcctgaag agcctgttcg gcaacgaccc cctgagccaa taa 



44/158

WO 03/004620
PCT/US02/21420

Figure 33
(Sheet 1 of 2)

gp160mod.TV1.dV2-gagmod.BW965

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 

361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 

1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg
1441 gtgcagcgcg agaagcgcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc 
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg 
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg 
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc 
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac 
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag 
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat caagatcttc 
1981 atcatgatcg tgggcggcct gatcggcctg cgcatcatct tcgccgtgct gagcatcgtg
2041 aaccgcgtgc gccagggcta cagccccctg agcttccaga ccctgacccc cagcccccgc 
2101 ggcctggacc gcctgggcgg catcgaggag gagggcggcg agcaggaccg cgaccgcagc 
2161 atccgcctgg tgagcggctt cctgagcctg gcctgggacg acctgcgcaa cctgtgcctg 
2221 ttcagctacc accgcctgcg cgacttcatc ctgatcgccg tgcgcgccgt ggagctgctg 
2281 ggccacagca gcctgcgcgg cctgcagcgc ggctgggaga tcctgaagta cctgggcagc 
2341 ctggtgcagt actggggcct ggagctgaag aagagcgcca tcagcctgct ggacaccatc
2401 gccatcaccg tggccgaggg caccgaccgc atcatcgagc tggtgcagcg catctgccgc
2461 gccatcctga acatcccccg ccgcatccgc cagggcttcg aggccgccct gctgtaactc 
2521 gagcaagtct agagggagac cacaacggtt tccctctagc gggatcaatt ccgccccccc 
2581 ccctaacgtt actggccgaa gccgcttgga ataaggccgg tgtgcgtttg tctatatgtt 
2641 attttccacc atattgccgt cttttggcaa tgtgagggcc cggaaacctg gccctgtctt 
2701 cttgacgagc attcctaggg gtctttcccc tctcgccaaa ggaatgcaag gtctgttgaa 
2761 tgtcgtgaag gaagcagttc ctctggaagc ttcttgaaga caaacaacgt ctgtagcgac 
2821 cctttgcagg cagcggaacc ccccacctgg cgacaggtgc ctctgcggcc aaaagccacg 
2881 tgtataagat acacctgcaa aggcggcaca accccagtgc cacgttgtga gttggatagt 
2941 tgtggaaaga gtcaaatggc tctcctcaag cgtattcaac aaggggctga aggatgccca 
3001 gaaggtaccc cattgtatgg gatctgatct ggggcctcgg tgcacatgct ttacatgtgt 
3061 ttagtcgagg ttaaaaaacg tctaggcccc ccgaaccacg gggacgtggt tttcctttga 
3121 aaaacacgat aataccatgg gcgcccgcgc cagcatcctg cgcggcggca agctggacgc 
Figure 33
(Sheet 2 of 2)

3181 ctgggagcgc atccgcctgc gccccggcgg caagaagtgc tacatgatga agcacctggt
3241 gtgggccagc cgcgagctgg agaagttcgc cctgaacccc ggcctgctgg agaccagcga
3301 gggctgcaag cagatcatcc gccagctgca ccccgccctg cagaccggca gcgaggagct
3361 gaagagcctg ttcaacaccg tggccaccct gtactgcgtg cacgagaaga tcgaggtccg
3421 cgacaccaag gaggccctgg acaagatcga ggaggagcag aacaagtgcc agcagaagat
3481 ccagcaggcc gaggccgccg acaagggcaa ggtgagccag aactacccca tcgtgcagaa
3541 cctgcagggc cagatggtgc accaggccat cagcccccgc accctgaacg cctgggtgaa
3601 ggtgatcgag gagaaggcct tcagccccga ggtgatcccc atgttcaccg ccctgagcga
3661 gggcgccacc ccccaggacc tgaacacgat gttgaacacc gtgggcggcc accaggccgc
3721 catgcagatg ctgaaggaca ccatcaacga ggaggccgcc gagtgggacc gcgtgcaccc
3781 cgtgcacgcc ggccccatcg cccccggcca gatgcgcgag ccccgcggca gcgacatcgc
3841 cggcaccacc agcaccctgc aggagcagat cgcctggatg accagcaacc cccccatccc
3901 cgtgggcgac atctacaagc ggtggatcat cctgggcctg aacaagatcg tgcggatgta
3961 cagccccgtg agcatcctgg acatcaagca gggccccaag gagcccttcc gcgactacgt
4021 ggaccgcttc ttcaagaccc tgcgcgccga gcagagcacc caggaggtga agaactggat
4081 gaccgacacc ctgctggtgc agaacgccaa ccccgactgc aagaccatcc tgcgcgctct
4141 cggccccggc gccagcctgg aggagatgat gaccgcctgc cagggcgtgg gcggccccag
4201 ccacaaggcc cgcgtgctgg ccgaggcgat gagccaggcc aacaccagcg tgatgatgca
4261 gaagagcaac ttcaagggcc cccggcgcat cgtcaagtgc ttcaactgcg gcaaggaggg
4321 ccacatcgcc cgcaactgcc gcgccccccg caagaagggc tgctggaagt gcggcaagga
4381 gggccaccag atgaaggact gcaccgagcg ccaggccaac ttcctgggca agatctggcc
4441 cagccacaag ggccgccccg gcaacttcct gcagagccgc cccgagccca ccgccccccc
4501 cgccgagagc ttccgcttcg aggagaccac ccccggccag aagcaggaga gcaaggaccg
4561 cgagaccctg accagcctga agagcctgtt cggcaacgac cccctgagcc aataa
Figure 34
(Sheet 1 of 1)

gp160mod.TV1.tpa2

1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 
61 tcgcccagca acaccgagga cctgtgggtg accgtgtact acggcgtgcc cgtgtggcgc 
121 gacgccaaga ccaccctgtt ctgcgccagc gacgccaagg cctacgagac cgaggtgcac 
181 aacgtgtggg ccacccacgc ctgcgtgccc accgacccca acccccagga gatcgtgctg 
241 ggcaacgtga ccgagaactt caacatgtgg aagaacgaca tggccgacca gatgcacgag 
301 gacgtgatca gcctgtggga ccagagcctg aagccctgcg tgaagctgac ccccctgtgc
361 gtgaccctga actgcaccga caccaacgtg accggcaacc gcaccgtgac cggcaacagc
421 accaacaaca ccaacggcac cggcatctac aacatcgagg agatgaagaa ctgcagcttc 
481 aacgccacca ccgagctgcg cgacaagaag cacaaggagt acgccctgtt ctaccgcctg 
541 gacatcgtgc ccctgaacga gaacagcgac aacttcacct accgcctgat caactgcaac 
601 accagcacca tcacccaggc ctgccccaag gtgagcttcg accccatccc catccactac 
661 tgcgcccccg ccggctacgc catcctgaag tgcaacaaca agaccttcaa cggcaccggc 
721 ccctgctaca acgtgagcac cgtgcagtgc acccacggca tcaagcccgt ggtgagcacc 
781 cagctgctgc tgaacggcag cctggccgag gagggcatca tcatccgcag cgagaacctg 
841 accgagaaca ccaagaccat catcgtgcac ctgaacgaga gcgtggagat caactgcacc 
901 cgccccaaca acaacacccg caagagcgtg cgcatcggcc ccggccaggc cttctacgcc 
961 accaacgacg tgatcggcaa catccgccag gcccactgca acatcagcac cgaccgctgg 
1021 aacaagaccc tgcagcaggt gatgaagaag ctgggcgagc acttccccaa caagaccatc 
1081 cagttcaagc cccacgccgg cggcgacctg gagatcacca tgcacagctt caactgccgc 
1141 ggcgagttct tctactgcaa caccagcaac ctgttcaaca gcacctacca cagcaacaac 
1201 ggcacctaca agtacaacgg caacagcagc agccccatca ccctgcagtg caagatcaag 
1261 cagatcgtgc gcatgtggca gggcgtgggc caggccacct acgccccccc catcgccggc 
1321 aacatcacct gccgcagcaa catcaccggc atcctgctga cccgcgacgg cggcttcaac
1381 accaccaaca acaccgagac cttccgcccc ggcggcggcg acatgcgcga caactggcgc 
1441 agcgagctgt acaagtacaa ggtggtggag atcaagcccc tgggcatcgc ccccaccaag 
1501 gccaagcgcc gcgtggtgca gcgcgagaag cgcgccgtgg gcatcggcgc cgtgttcctg 
1561 ggcttcctgg gcgccgccgg cagcaccatg ggcgccgcca gcatcaccct gaccgtgcag
1621 gcccgccagc tgctgagcgg catcgtgcag cagcagagca acctgctgaa ggccatcgag 
1681 gcccagcagc acatgctgca gctgaccgtg tggggcatca agcagctgca ggcccgcgtg 
1741 ctggccatcg agcgctacct gaaggaccag cagctgctgg gcatctgggg ctgcagcggc 
1801 cgcctgatct gcaccaccgc cgtgccctgg aacagcagct ggagcaacaa gagcgagaag 
1861 gacatctggg acaacatgac ctggatgcag tgggaccgcg agatcagcaa ctacaccggc 
1921 ctgatctaca acctgctgga ggacagccag aaccagcagg agaagaacga gaaggacctg 
1981 ctggagctgg acaagtggaa caacctgtgg aactggttcg acatcagcaa ctggccctgg 
2041 tacatcaaga tcttcatcat gatcgtgggc ggcctgatcg gcctgcgcat catcttcgcc 
2101 gtgctgagca tcgtgaaccg cgtgcgccag ggctacagcc ccctgagctt ccagaccctg 
2161 acccccagcc cccgcggcct ggaccgcctg ggcggcatcg aggaggaggg cggcgagcag 
2221 gaccgcgacc gcagcatccg cctggtgagc ggcttcctga gcctggcctg ggacgacctg
2281 cgcaacctgt gcctgttcag ctaccaccgc ctgcgcgact tcatcctgat cgccgtgcgc
2341 gccgtggagc tgctgggcca cagcagcctg cgcggcctgc agcgcggctg ggagatcctg
2401 aagtacctgg gcagcctggt gcagtactgg ggcctggagc tgaagaagag cgccatcagc
2461 ctgctggaca ccatcgccat caccgtggcc gagggcaccg accgcatcat cgagctggtg
2521 cagcgcatct gccgcgccat cctgaacatc ccccgccgca tccgccaggg cttcgaggcc 
2581 gccctgctgt aa 
gp160mod.TV1-gagmod.BW965

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 

61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 

121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 

181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 

241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 

301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 

361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 

421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 

481 aactgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg 

541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg acaacttcac ctaccgcctg 

601 atcaactgca acaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 

661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 

721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc 

841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 

901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 

961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 

1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 

1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc 
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc 
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcaa gatcttcatc atgatcgtgg gcggcctgat cggcctgcgc 
2101 atcatcttcg ccgtgctgag catcgtgaac cgcgtgcgcc agggctacag ccccctgagc 
2161 ttccagaccc tgacccccag cccccgcggc ctggaccgcc tgggcggcat cgaggaggag 
2221 ggcggcgagc aggaccgcga ccgcagcatc cgcctggtga gcggcttcct gagcctggcc 
2281 tgggacgacc tgcgcaacct gtgcctgttc agctaccacc gcctgcgcga cttcatcctg 
2341 atcgccgtgc gcgccgtgga gctgctgggc cacagcagcc tgcgcggcct gcagcgcggc
2401 tgggagatcc tgaagtacct gggcagcctg gtgcagtact ggggcctgga gctgaagaag 
2461 agcgccatca gcctgctgga caccatcgcc atcaccgtgg ccgagggcac cgaccgcatc 
2521 atcgagctgg tgcagcgcat ctgccgcgcc atcctgaaca tcccccgccg catccgccag 
2581 ggcttcgagg ccgccctgct gtaactcgag caagtctaga gggagaccac aacggtttcc 
2641 ctctagcggg atcaattccg cccccccccc taacgttact ggccgaagcc gcttggaata 
2701 aggccggtgt gcgtttgtct atatgttatt ttccaccata ttgccgtctt ttggcaatgt 
2761 gagggcccgg aaacctggcc ctgtcttctt gacgagcatt cctaggggtc tttcccctct 
2821 cgccaaagga atgcaaggtc tgttgaatgt cgtgaaggaa gcagttcctc tggaagcttc 
2881 ttgaagacaa acaacgtctg tagcgaccct ttgcaggcag cggaaccccc cacctggcga 
2941 caggtgcctc tgcggccaaa agccacgtgt ataagataca cctgcaaagg cggcacaacc 
3001 ccagtgccac gttgtgagtt ggatagttgt ggaaagagtc aaatggctct cctcaagcgt 
3061 attcaacaag gggctgaagg atgcccagaa ggtaccccat tgtatgggat ctgatctggg 
3121 gcctcggtgc acatgcttta catgtgttta gtcgaggtta aaaaacgtct aggccccccg 
3181 aaccacgggg acgtggtttt cctttgaaaa acacgataat accatgggcg cccgcgccag 

3241 catcctgcgc ggcggcaagc tggacgcctg ggagcgcatc cgcctgcgcc ccggcggcaa 

3301 gaagtgctac atgatgaagc acctggtgtg ggccagccgc gagctggaga agttcgccct 

3361 gaaccccggc ctgctggaga ccagcgaggg ctgcaagcag atcatccgcc agctgcaccc 

3421 cgccctgcag accggcagcg aggagctgaa gagcctgttc aacaccgtgg ccaccctgta 

3481 ctgcgtgcac gagaagatcg aggtccgcga caccaaggag gccctggaca agatcgagga 

3541 ggagcagaac aagtgccagc agaagatcca gcaggccgag gccgccgaca agggcaaggt 

3601 gagccagaac taccccatcg tgcagaacct gcagggccag atggtgcacc aggccatcag 

3661 cccccgcacc ctgaacgcct gggtgaaggt gatcgaggag aaggccttca gccccgaggt 

3721 gatccccatg ttcaccgccc tgagcgaggg cgccaccccc caggacctga acacgatgtt 

3781 gaacaccgtg ggcggccacc aggccgccat gcagatgctg aaggacacca tcaacgagga 

3841 ggccgccgag tgggaccgcg tgcaccccgt gcacgccggc cccatcgccc ccggccagat 

3901 gcgcgagccc cgcggcagcg acatcgccgg caccaccagc accctgcagg agcagatcgc 

3961 ctggatgacc agcaaccccc ccatccccgt gggcgacatc tacaagcggt ggatcatcct 

4021 gggcctgaac aagatcgtgc ggatgtacag ccccgtgagc atcctggaca tcaagcaggg 

4081 ccccaaggag cccttccgcg actacgtgga ccgcttcttc aagaccctgc gcgccgagca 

4141 gagcacccag gaggtgaaga actggatgac cgacaccctg ctggtgcaga acgccaaccc 

4201 cgactgcaag accatcctgc gcgctctcgg ccccggcgcc agcctggagg agatgatgac 

4261 cgcctgccag ggcgtgggcg gccccagcca caaggcccgc gtgctggccg aggcgatgag 

4321 ccaggccaac accagcgtga tgatgcagaa gagcaacttc aagggccccc ggcgcatcgt 

4381 caagtgcttc aactgcggca aggagggcca catcgcccgc aactgccgcg ccccccgcaa 

4441 gaagggctgc tggaagtgcg gcaaggaggg ccaccagatg aaggactgca ccgagcgcca 

4501 ggccaacttc ctgggcaaga tctggcccag ccacaagggc cgccccggca acttcctgca 

4561 gagccgcccc gagcccaccg ccccccccgc cgagagcttc cgcttcgagg agaccacccc 
4621 cggccagaag caggagagca aggaccgcga gaccctgacc agcctgaaga gcctgttcgg 

4681 caacgacccc ctgagccaat aa 
int.opt.mut_C (South Africa TV1) 

TTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACAGCAACTGGCGCGCCATGGCC 
AACGAGTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCAGCGCCGACAAGTGCCAGCTGAAG 
GGCGAGGCCATCCACGGCCAGGTGGACTGCAGCCCCGGCATCTGGCAGCTGGCCTGCACCCACCTGGAG 
GGCAAGATCATCCTGGTGGCCGTGCACGTGGCCAGCGGCTACATGGAGGCCGAGGTGATCCCCGCCGAG 

accggccaggagaccgcctacttcatcctgaagctggccggccgctggcccgtgaaggtgatccacacc 
gccaacggcagcaacttcaccagcaccgccgtgaaggccgcctgctggtgggccggcatccagcaggag 
ttcggcatcccctacaacccccagagccXgggcgtggtggcgagcatgaacaaggagctgaagaagatc 
atcggccaggtgcgcgaccaggccgagcacctgaagaccgccgtgcagatggccgtgttcatccacaac 
ttcaagcgcaagggcggcatcggcggctacagcgccggcgagcgcatcatcgacatcatcgccaccgac 
atccagaccaaggagctgcagaagcagatcatccgcatccagaacttccgcgtgtactaccgcgacagc 
cgcgaccccatcaagggccccgccgagctgctgtggaagggcgagggcgtggtggtgatcgaggacaag 
ggcgacatcaaggtggtgccccgccgcaaggccaagatcatccgcgactacggcaagcagatggccggc 
gccgactgcgtggccggcggccaggacgaggac 
int.opt_C (South Africa TV1) 

TTCCTGGACGGCATCGACAAGGCCCAGGAGGAGCACGAGCGCTACCACAGCAACTGGCGCGCCATGGCC 
AACGAGTTCAACCTGCCCCCCATCGTGGCCAAGGAGATCGTGGCCAGCTGCGACAAGTGCCAGCTGAAG 
GGCGAGGCCATCCACGGCCAGGTGGACTGCAGCCCCGGCATCTGGCAGCTGGACTGCACCCACCTGGAG 
GGCAAGATCATCCTGGTGGCCGTGCACGTGGCCAGCGGCTACATGGAGGCCGAGGTGATCCCCGCCGAG 
ACCGGCCAGGAGACCGCCTACTTCATCCTGAAGCTGGCCGGCCGCTGGCCCGTGAAGGTGATCCACACC 
GACAACGGCAGCAACTTCACCAGCACCGCCGTGAAGGCCGCCTGCTGGTGGGCCGGCATCCAGCAGGAG 
TTCGGCATCCCCTACAACCCCCAGAGCCAGGGCGTGGTGGAGAGCATGAACAAGGAGCTGAAGAAGATC 
ATCGGCCAGGTGCGCGACCAGGCCGAGCACCTGAAGACCGCCGTGCAGATGGCCGTGTTCATCCACAAC 
TTCAAGCGCAAGGGCGGCATCGGCGGCTACAGCGCCGGCGAGCGCATCATCGACATCATCGCCACCGAC 
ATCCAGACCAAGGAGCTGCAGAAGCAGATCATCCGCATCCAGAACTTCCGCGTGTACTACCGCGACAGC 
CGCGACCCCATCTGGAAGGGCCCCGCCGAGCTGCTGTGGAAGGGCGAGGGCGTGGTGGTGATCGAGGAC 
AAGGGCGACATCAAGGTGGTGCCCCGCCGCAAGGCCAAGATCATCCGCGACTACGGCAAGCAGATGGCC 
GGCGCCGACTGCGTGGCCGGCGGCCAGGACGAGGAC 
nef.D106G.-myr19.opt_C (dbl.mutant) 

ATGATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGC 
GCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAG 
GAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGAC 
CTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATC 
CTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGC 
GTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG 
GCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGC 
GAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAG 
TACTACAAGGACTGCGCC 
TACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGA 

CCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGC 

AGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGC 

CAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAA 

CCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCA 

CAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGG 

TGCTC 
GCCACCATGGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAG 
GGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCC 
CCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAG 
GCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAAC 
CGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCC 
GAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTG 
GGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTG 
CCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAG 
ATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATC 
ATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTG 
CCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATC 
AAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAAC 
CCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTC 
CGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTG 
AAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGAC 
TTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTAC 
AACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAG 
CCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAG 
ATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCC 
GACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAG 
CCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTCGGCAAGCTGAAC 
TGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCC 
CTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGC 
GAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCAC 
GACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATG 
CGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATC 
GTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACC 
GACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGG 
TACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAG 
ACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACC 
ACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATC 
GTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTG 
AACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGC 
ATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGACGGC 
ATCGATGGCGGCATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATC 
GATTAAAAGCTTCCCGGGGCTAGCACCGGT 
GTCGACGCCACCATGGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAAC 

TTCAAGGGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGC 

CGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAG 

CGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAG 

CAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACTJ^CCCCCGCAGCGAGG^ 

GGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATC 

AAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATG 

AGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTAC 

GACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTG 

AACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAG 

ACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAG 

AAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCC 

GAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTG 

GACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCC 

GGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGAC 

GAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTAC 

CAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATC 

CTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGAC 

CTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACC 

ACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGGCTACGAGCTGCACCCCGAC^^ 

TGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTG 

GGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGC 

GGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGC 

GAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAG 

AAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAG 

TACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCC 

ATGGAGAG^TCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACC^ 

ACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTG 

GTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCC 

GCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGC 

CTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGC 

GAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAG 

AGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCC 

GCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTG 

TTCCTGGACGGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGC 

GGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACCGGT 
GCCACCATGGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAG 

GGCCCCAAGCGCATCATCAAGTGCTTCy^CTGCGGCAAGGAGGGCCACATCGCCCGCT^CTGCC^ 

CCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAG 

GCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAAC 

CGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCC 

GAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTG 

GGCGGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTG 

CCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAG 

ATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATC 

ATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTG 

CCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATC 

AAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAAC 

CCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTC 

CGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTG 

AAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGAC 

TTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTAC 

AACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAG 

CCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGAC 

CTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACC 

ACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGGCTACGAGCTGCACCCCGACAAG 

TGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTG 

GGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGC 

GGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGC 

GAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAG 

AAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAG 

TACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCC 

ATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGTC 

ACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTG 

GTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCC 

GCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGC 

CTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGC 

GAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAG 

AGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCC 

GCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTG 

TTCCTGGACGGCATCGATGGCGGCATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGGC 

GGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACCGGT 
GTCGACGCCACCATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAG 
ACCGCCGGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGC 
CTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCAC 
CAGAACCCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAG 
AAGAAGGTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGAC 
GAGGCCCTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAG 
GGCACCCGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATC 
AGCGAGCGCATCCTGAGCACCTGCCTGGGCCGGCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGAC 
CTGCGCCTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACC 
GAGGGCGTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTG 
CGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAG 
CACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAG 
GAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCC 
TTCGACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAG 
GAGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGC 
CCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTG 
GAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAG 
GACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCAC 
CCCGAGTACTACAAGGACTGCGAATTCGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATG 
CAGCGCAGCAACTTCAAGGGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATC 
GCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAG 
GACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAG 
TTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACAACCCC 
CGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTGTGGCAGCGCCCC 
CTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTG 
CTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAG 
GTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGC 
CCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATC 
AGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCC 
CTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACC 
AAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGG 
CGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATC 
CCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGC 
GTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCC 
GGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGC 
ATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTAC 
GTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGC 
TGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCC 
GACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAG 
CTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTG 
CTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAG 
AACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAG 
ATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACC 
GGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAG 
ATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACC 
TGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCC 
CCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGAC 
GGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATC 
GTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGC 
GGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAG 
AGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGG 
GTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAG 
GTGCTGTAA 
GCCACCATGGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAG 

GGCCCCAAGCGCATCATCAAGTGCTTCTUVCTGCGGCAAGGAGGGCraCATCGCCCGCAACTGCCGCGCC 

CCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAG 

GCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCC 

CGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCC 

GAGCGCCAGGGCACCCTGAACOTCCCCCAGATCACCCTGT6GCAGCGCCCCCTGGTGAGCATC 

GGCGGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTG 

CCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAG 

ATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATC 

ATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTG 

CCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATC 

AAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAAC 

CCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTC 

CGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTG 

AAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGAC 

TTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTAC 

AACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAG 

CCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGTACATGGACGACCTGTACGTGGGCAGCGAC 

CTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACC 

ACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGGCTACGAGCTGCACCCCGACAAG 

TGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTG 

GGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCyAGGTGCGCCAGCTGTGCAAGCT^ 

GGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGC 

GAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCy^GCy^CMACCTGGTGGCCGAGATCC^ 

AAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCyiGGAGCCCTTCAAGAACCTGAAGACCGGCAAG 

TACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCC 

ATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAG 

ACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTG 

GTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCC 

GCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGC 

CTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGC 

GAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAG 

AGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCC 

GCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTG 

GAATTCGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCTGC 

AACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGTGCTTCCAGACCAAGGGCCTGGGCATC 

AGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCC 

ATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTG 

GAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTG 

CTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGC 

CAGGCCCGCAAGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGC 

ATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCATCGAGCGCCTG 

CACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTG 

GGCAGCCCCCTCGAGGGCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGC 

ATCCGCCGCACCGAGCCCGCCCGCGAGGGCGCCGCCGAGGGCGCCGCCGAGGGCGTGGGCGCCGCCAGC 

CAGGACCTGGACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGG 

CTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATG 

ACCTACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTAC 

AGCAAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGACTGGCAG 

AACTACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTG 

GACCCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCAC 

GGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATG 

GCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGC 
GTCGACGCCACCATGGCCGAGGCCATGAGCCAGGCCACCAGCGCCAACATCCTGATGCAGCGCAGCAAC 
TTCAAGGGCCCCAAGCGCATCATCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACT^ 
CGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAG 
CGCCAGGCCAACTTCTTCCGCGAGGACCTGGCCTTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAG 
CAGAACCGCGCCAACAGCCCCACCAGCCGCGAGCTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCC 
GGCGCCGAGCGCCAGGGCACCCTGAACTTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATC 
AAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATG 
AGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTAC 
GACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTG 
AACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAG 
ACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAG 
AAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCC 
GAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTG 
GACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCC 
GGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGAC 
GAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGC7VTCCGCTAC 
CAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATC 
CTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGAC 
CTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACC 
ACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACC 
GTGCAGCCCATGGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAG 
CTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCC 
AAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATC 
CTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAG 
GGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCC 
AAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAG 
AGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGG 
TGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAG 
CTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAAC 
CGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACC 
GAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTG 
AACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAG 
CTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCAC 
AAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTGGAATTC 
GAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAACAAG 
TGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGCTAC 
GGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATCAGC 
AAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAGAGC 
AAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTGCAG 
GCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAGGCC 
GACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATCCTG 
AGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCACATC 
GACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGCAGC 
CCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGC 
CGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGCGCCCTGACC 
AGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTG 
GGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGCTTC 
TTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGACCTG 
TGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTAC 
CCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCT^CAAG 
GGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTG 
AAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAG 
GACTGCGCCTAAATCTAGA 
CCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTG 

CTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATG 

ATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGG^ 

AAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGM 

CTGGGCTGCACCCTGAACTTCCCCATOVGCCCCATCGAQACCGTGCCCGTGAAGCTGAAGCCCGC^^ 

GACGGCCCCAAGGTGAAGCAGTGGCCCCOKSACCGAGGAQAAGATCAAGGCCCTGACCGC^ 

GAGATGGAGAAGGAGGGCTLAGATO^CCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 

ATCAAGAAGAAGGACAGCACCJUVGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAG 

GACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTG 

CTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACC 

ATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAG 

GGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAG 

ATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATC 

GAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCC 

CCCTTCCTGTGGATGGGCTACGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAG 

AAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTAC 

CCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCC 

CTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTG 

TACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAG 

ATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAAC 

GACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACC 

CCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACC 

TGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGC^^ 

CCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCC 

GGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAG 

CTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTAC 

GCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAG 

CTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAG 

ATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTC 
CCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTG 

CTGGCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATG 

ATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAG 

AAGGCCATCGGCACCGTGCTCATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAG 

CTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTG^ 

GACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAG 

GAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCC 

ATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAG 

GACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGT^GAAGAAGAAGAGCGTGACCGTG 

CTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACC 

ATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAG 

GGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAG 

ATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATC 

GAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCC 

CCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAG 

AGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGC 

ATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACC 

GAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTAC 

GACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTAC 

CAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTG 

AAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAG 

TTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACC^ 

CCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATC 

ATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTAC 

GTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAG 

GCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTG 

GGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATC 

AAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGAC 

AAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTC 
GCCACCATGCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAG 
GAGGCCCTGCTGGACACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAG 
CCCAAGATGATCGGCGGCATCGGCGGCTTCATO^GGTGCGCCAGTACGACCAGATCCTGATCGAGATC 
TGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATG 
CTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAG 
CCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCC 
ATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCC 
GTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAG 
CGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGTUVGAAGAAGAAGAGC 
GTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACC 
GCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAG 
GGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGC 
AACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGC 
GCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAG 
AAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCC 
GAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATC 
TACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTG 
CCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGC 
GTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTAC 
CAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACC 
AACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAG 
ACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCC 
ACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAG 
GAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAG 
GCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACC 
GAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAG 
TACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAG 
CAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAG 
CAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTCGAATTCGAGCCCGTGGACCCCAACCTG 
GAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAACAAGTGCTACTGCAAGCACTGCAGC 
TACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGCTACGGCCGCAAGAAGCGCCGCCAG 
CGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATCAGCAAGCAGCCCCTGCCCCAGACC 
CGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAGAGCAAGACCGAGACCGACCCCTTC 
GACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTGCAGGCCGTGCGCATCATCAAGATC 
CTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAGGCCGACCTGAACCGCCGCCGCCGC 
TGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATCCTGAGCACCTGCCTGGGCCGCCCC 
GCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCACATCGACTGCAGCGAGAGCAGCGGC 
ACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGCAGCCCCCTCGAGGCCGGCAAGTGG 
AGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATCCGCCGCACCGAGCCCGCCGCCGAG 
GGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAAC 
AACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAG 
GTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAGGAGAAGGGCGGC 
CTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACCACACCCAGGGC 
TTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGC 
TTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTG 
CACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGC 
CTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACTGCGCCTAA 
ATGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTAC 
CAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGC 
GCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATCCTGAGCACCTGCCOXSGGCCGCCCCGCCGAG 
CCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGC 
GGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGCAGCCCC 

ATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAAC
AAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGC 
TACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATC 
AGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAG
AGCAAGACCGAGACCGACCCCTTCGAC 

ATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCTGCAAC
AAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGC 
TACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATC 
AGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAG 
AGCAAGACCGAGACCGACCCCTTCGAC 

ATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCTGCAAC

AAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGTGCTTCCAGACCAAGGGCCTGGGCATCAGC 

TACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATC 

AGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAG 

AGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTG 

CAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAG 

GCCCGCAAGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATC 

CTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCATCGAGCGCCTGCAC 

ATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGC 

AGCCCCCTCGAGGGCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATC 

CGCCGCACCGAGCCCGCCCGCGAGGGCGCCGCCGAGGGCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAG 

GACCTGGACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTG 

GAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACC 

TACAAGGCCGCCTTCGACCTGAGCTTCTTCCTCAAGGAGAAGGGCGGCCTGGAGGGCCTC^ 

AAGAAGCGCCAGGAGATCCTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGACTGGCAGAAC 

TACACCCCCGGCCCCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGAC 

CCCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGC 

ATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCC

CGCGAGCTGCACCCCGAGTACTACAAGGACTGC 

ATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCCGGCAAC
AAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGCATCAGC 
TACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAACCCCATC 
AGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAGGTGGAG 
AGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCCCTGCTG 
CAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACCCGCCAG 
GCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAGCGCATC 
CTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGCCTGCAC 
ATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGCGTGGGC 
AGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCGCATC 
CGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGCGCCCTG 
ACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAGGAGGAG 
GTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGC
TTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGAC 
CTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGCGTGCGC 
TACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAGGCCAAC 
AAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGCGAGGTG 
CTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTAC 
AAGGACTGCGCCTAA 

GCCACCATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCC
GGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGC 
ATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAAC 
CCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAG 
GTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCC 
CTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACC 
CGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAG 
CGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGC 
CTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGC 
GTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAG 
CGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGC 
GCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAG 
GAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGAC 
CTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATC 
CTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGC 
GTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG 
GCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGC 
GAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAG 
TACTACAAGGACTGCGAATTCGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAG 
CGCATCCGCCTGCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAG 
CTGGAGAAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAG 
CTGCACCCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTAC 
TGCGTGCACGAGAAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAAC 
AAGTGCCAGCAGAAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATC
GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAG
GTGATCGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACC
CCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAC 
ACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGC 
CAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGG 
ATGACCAGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAG 
ATCGTGCGGATGTACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGAC 
TACGTGGACCGCTTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACC 
GACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCC 
AGCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCC 
GAGGCGATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATC 
GTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGC 
TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGC 
AAGATCTGGCCCAGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCC 
CCCGCCGAGAGCTTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACC 
CTGACCAGCCTGAAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGCCTAA 
TatRevNefgagCpolIna_C

GCCACCATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCC 

GGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGC 

ATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAAC 

CCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAG 

GTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCC 

CTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACC 

CGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAG 

CGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGC 

CTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGC 

GTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAG 

CGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGC 

GCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAG 

GAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGAC 

CTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATC 

CTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGC

GTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG 

GCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGC

GAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAG 

TACTACAAGGACTGCCTCGAGGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAG 

CGCATCCGCCTGCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAG 

CTGGAGAAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAG 

CTGCACCCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTAC 

TGCGTGCACGAGAAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAAC 

AAGTGCCAGCAGAAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATC 

GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAG

GTGATCGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACC 

CCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAC 

ACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGC 

CAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGG 

ATGACCAGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAG 

ATCGTGCGGATGTACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGAC 

TACGTGGACCGCTTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACC 

GACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCC 

AGCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCC 

GAGGCGATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATC 

GTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGC 

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGC 

AAGATCTGGCCCAGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCC 

CCCGCCGAGAGCTTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACC 

CTGACCAGCCTGAAGAGCCTGTTCGGCAACGACCCCCTGAGCCAAGAATTCGCCGAGGCCATGAGCCAG 

GCCACCAGCGCCAACATCCTGATGCAGCGCAGCAACTTCAAGGGCCCCAAGCGCATCATCAAGTGCTTC 

AACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGCTGCTGGAAGTGC 

GGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCTTCCGCGAGGACCTGGCC

TTCCCCCAGGGCAAGGCCCGCGAGTTCCCCAGCGAGCAGAACCGCGCCAACAGCCCCACCAGCCGCGAG 

CTGCAGGTGCGCGGCGACAACCCCCGCAGCGAGGCCGGCGCCGAGCGCCAGGGCACCCTGAACTTCCCC 

CAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTG 

GCCACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATC 

GGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAG 

GCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTG 

GGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGAC 

GGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAG 

ATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATC 

AAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGAC

TTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTG 
GACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATC
CCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGC 
AGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTCGAGCCCTTCCGCGCCCGCAACCCCGAGATC 
GTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAG 
GAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCC 
TTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGC 
TGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATC 
AAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAG 
GAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGAC 
CCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAG 
GAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAG 
CAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTC
CGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCC 
GAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATC 
GGCGCCGAGAGCTTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTG 
ACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCC 
ATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGC 
ATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAG 
AAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAG 
CTGGTGAGCAAGGGCATCCGCAAGGTGCTGTTCCTGGACGGCATCGATGGCGGCATCGTGATCTACCAG 
TACATGGACGACCTGTACGTGGGCAGCGGCGGCCCTAGGATCGATTAAAAGCTTCCCGGGGCTAGCACC 
GGTTCTAGA 
TatRevNefGagProtInaRTmut_C

GCCACCATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCC 

GGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGC 

ATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAAC 

CCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAG 

GTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCC

CTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACC 

CGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAG 

CGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGC

CTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGC 

GTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAG 

CGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGC 

GCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAG 

GAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGAC 

CTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATC 

CTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGC

GTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG

GCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGC 

GAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAG 

TACTACAAGGACTGCAAGCTTGGCGCCCGCGCCAGCATCCTGCGCGGCGGCAAGCTGGACGCCTGGGAG 

CGCATCCGCCTGCGCCCCGGCGGCAAGAAGTGCTACATGATGAAGCACCTGGTGTGGGCCAGCCGCGAG 

CTGGAGAAGTTCGCCCTGAACCCCGGCCTGCTGGAGACCAGCGAGGGCTGCAAGCAGATCATCCGCCAG 

CTGCACCCCGCCCTGCAGACCGGCAGCGAGGAGCTGAAGAGCCTGTTCAACACCGTGGCCACCCTGTAC 

TGCGTGCACGAGAAGATCGAGGTCCGCGACACCAAGGAGGCCCTGGACAAGATCGAGGAGGAGCAGAAC 

AAGTGCCAGCAGAAGATCCAGCAGGCCGAGGCCGCCGACAAGGGCAAGGTGAGCCAGAACTACCCCATC 

GTGCAGAACCTGCAGGGCCAGATGGTGCACCAGGCCATCAGCCCCCGCACCCTGAACGCCTGGGTGAAG 

GTGATCGAGGAGAAGGCCTTCAGCCCCGAGGTGATCCCCATGTTCACCGCCCTGAGCGAGGGCGCCACC 

CCCCAGGACCTGAACACGATGTTGAACACCGTGGGCGGCCACCAGGCCGCCATGCAGATGCTGAAGGAC 

ACCATCAACGAGGAGGCCGCCGAGTGGGACCGCGTGCACCCCGTGCACGCCGGCCCCATCGCCCCCGGC 

CAGATGCGCGAGCCCCGCGGCAGCGACATCGCCGGCACCACCAGCACCCTGCAGGAGCAGATCGCCTGG 

ATGACCAGCAACCCCCCCATCCCCGTGGGCGACATCTACAAGCGGTGGATCATCCTGGGCCTGAACAAG 

ATCGTGCGGATGTACAGCCCCGTGAGCATCCTGGACATCAAGCAGGGCCCCAAGGAGCCCTTCCGCGAC 

TACGTGGACCGCTTCTTCAAGACCCTGCGCGCCGAGCAGAGCACCCAGGAGGTGAAGAACTGGATGACC 

GACACCCTGCTGGTGCAGAACGCCAACCCCGACTGCAAGACCATCCTGCGCGCTCTCGGCCCCGGCGCC 

AGCCTGGAGGAGATGATGACCGCCTGCCAGGGCGTGGGCGGCCCCAGCCACAAGGCCCGCGTGCTGGCC 

GAGGCGATGAGCCAGGCCAACACCAGCGTGATGATGCAGAAGAGCAACTTCAAGGGCCCCCGGCGCATC 

GTCAAGTGCTTCAACTGCGGCAAGGAGGGCCACATCGCCCGCAACTGCCGCGCCCCCCGCAAGAAGGGC 

TGCTGGAAGTGCGGCAAGGAGGGCCACCAGATGAAGGACTGCACCGAGCGCCAGGCCAACTTCCTGGGC 

AAGATCTGGCCCAGCCACAAGGGCCGCCCCGGCAACTTCCTGCAGAGCCGCCCCGAGCCCACCGCCCCC 

CCCGCCGAGAGCTTCCGCTTCGAGGAGACCACCCCCGGCCAGAAGCAGGAGAGCAAGGACCGCGAGACC 

CTGACCAGCCTGAAGAGCCTGTTCGGCAACGACCCCCTGAGCCAGAAAGAATTCCCCCAGATCACCCTG 

TGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGCGGCCAGATCAAGGAGGCCCTGCTGGCCACCGGCGCC 

GACGACACCGTGCTGGAGGAGATGAGCCTGCCCGGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGC 

GGCTTCATCAAGGTGCGCCAGTACGACCAGATCCTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACC

GTGCTGATCGGCCCCACCCCCGTGAACATCATCGGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTG 

AACTTCCCCATCAGCCCCATCGAGACCGTGCCCGTGAAGCTGAAGCCCGGCATGGACGGCCCCAAGGTG 

AAGCAGTGGCCCCTGACCGAGGAGAAGATCAAGGCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAG 

GGCAAGATCACCAAGATCGGCCCCGAGAACCCCTACAACACCCCCGTGTTCGCCATCAAGAAGAAGGAC 

AGCACCAAGTGGCGCAAGCTGGTGGACTTCCGCGAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTG 

CAGCTGGGCATCCCCCACCCCGCCGGCCTGAAGAAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGAC 

GCCTACTTCAGCGTGCCCCTGGACGAGGACTTCCGCAAGTACACCGCCTTCACCATCCCCAGCATCAAC

AACGAGACCCCCGGCATCCGCTACCAGTACAACGTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATC 

TTCCAGAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAG

GCCCCCCTGTACGTGGGCAGCGACCTGGAGATCGGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAG 

CACCTGCTGCGCTGGGGCTTCACCACCCCCGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATC
GAGCTGCACCCCGACAAGTGGACCGTGCAGCCCATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAAC

GACATCCAGAAGCTGGTGGGCAAGCTGAACTGGGCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAG

CTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTGACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTG 

GAGCTGGCCGAGAACCGCGAGATCCTGCGCGAGCCCGTGCACGGCGTGTACTACGACCCCAGCAAGGAC 

CTGGTGGCCGAGATCCAGAAGCAGGGCCACGACCAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAG 

AACCTGAAGACCGGCAAGTACGCCAAGATGCGCACCGCCCACACCAACGACGTGAAGCAGCTGACCGAG 

GCCGTGCAGAAGATCGCCATGGAGAGCATCGTGATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATC 

CAGAAGGAGACCTGGGAGACCTGGTGGACCGACTACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTC 

GTGAACACCCCCCCCCTGGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACC 

TTCTACGTGGACGGCGCCGCCAACCGCGAGACCAAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGC 

CGGCAGAAGATCGTGAGCCTGACCGAGACCACCAACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCC 

CTGCAGGACAGCGGCAGCGAGGTGAACATCGTGACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCC 

CAGCCCGACAAGAGCGAGAGCGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTG 

TACCTGAGCTGGGTGCCCGCCCACAAGGGCATCGGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAG 

GGCATCCGCAAGGTGCTCTAA 
TatRevNef.ProtRT.opt_C

GCCACCATGGAGCCCGTGGACCCCAACCTGGAGCCCTGGAACCACCCCGGCAGCCAGCCCAAGACCGCC 

GGCAACAAGTGCTACTGCAAGCACTGCAGCTACCACTGCCTGGTGAGCTTCCAGACCAAGGGCCTGGGC 

ATCAGCTACGGCCGCAAGAAGCGCCGCCAGCGCCGCAGCGCCCCCCCCAGCAGCGAGGACCACCAGAAC 

CCCATCAGCAAGCAGCCCCTGCCCCAGACCCGCGGCGACCCCACCGGCAGCGAGGAGAGCAAGAAGAAG 

GTGGAGAGCAAGACCGAGACCGACCCCTTCGACCCCGGGGCCGGCCGCAGCGGCGACAGCGACGAGGCC 

CTGCTGCAGGCCGTGCGCATCATCAAGATCCTGTACCAGAGCAACCCCTACCCCAAGCCCGAGGGCACC 

CGCCAGGCCGACCTGAACCGCCGCCGCCGCTGGCGCGCCCGCCAGCGCCAGATCCACAGCATCAGCGAG 

CGCATCCTGAGCACCTGCCTGGGCCGCCCCGCCGAGCCCGTGCCCTTCCAGCTGCCCCCCGACCTGCGC 

CTGCACATCGACTGCAGCGAGAGCAGCGGCACCAGCGGCACCCAGCAGAGCCAGGGCACCACCGAGGGC 

GTGGGCAGCCCCCTCGAGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAG 

CGCATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGGACAAGCACGGC 

GCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCCTGGCTGGAGGCCCAGGAGGAGGAG 

GAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGTGCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGAC 

CTGAGCTTCTTCCTGAAGGAGAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATC 

CTGGACCTGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCCCCGGC 

GTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACCCCCGCGAGGTGGAGGAG 

GCCAACAAGGGCGAGAACAACTGCCTGCTGCACCCCATGAGCCAGCACGGCATGGAGGACGAGGACCGC 

GAGGTGCTGAAGTGGAAGTTCGACAGCAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAG 

TACTACAAGGACTGCGAATTCCCCCAGATCACCCTGTGGCAGCGCCCCCTGGTGAGCATCAAGGTGGGC 

GGCCAGATCAAGGAGGCCCTGCTGGACACCGGCGCCGACGACACCGTGCTGGAGGAGATGAGCCTGCCC 

GGCAAGTGGAAGCCCAAGATGATCGGCGGCATCGGCGGCTTCATCAAGGTGCGCCAGTACGACCAGATC 

CTGATCGAGATCTGCGGCAAGAAGGCCATCGGCACCGTGCTGATCGGCCCCACCCCCGTGAACATCATC 

GGCCGCAACATGCTGACCCAGCTGGGCTGCACCCTGAACTTCCCCATCAGCCCCATCGAGACCGTGCCC 

GTGAAGCTCAAGCCCGGCATGGACGGCCCCAAGGTGAAGCAGTGGCCCCTGACCGAGGAGAAGATCAAG

GCCCTGACCGCCATCTGCGAGGAGATGGAGAAGGAGGGCAAGATCACCAAGATCGGCCCCGAGAACCCC 

TACAACACCCCCGTGTTCGCCATCAAGAAGAAGGACAGCACCAAGTGGCGCAAGCTGGTGGACTTCCGC 

GAGCTGAACAAGCGCACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCCGGCCTGAAG 

AAGAAGAAGAGCGTGACCGTGCTGGACGTGGGCGACGCCTACTTCAGCGTGCCCCTGGACGAGGACTTC 

CGCAAGTACACCGCCTTCACCATCCCCAGCATCAACAACGAGACCCCCGGCATCCGCTACCAGTACAAC 

GTGCTGCCCCAGGGCTGGAAGGGCAGCCCCAGCATCTTCCAGAGCAGCATGACCAAGATCCTGGAGCCC 

TTCCGCGCCCGCAACCCCGAGATCGTGATCTACCAGGCCCCCCTGTACGTGGGCAGCGACCTGGAGATC 

GGCCAGCACCGCGCCAAGATCGAGGAGCTGCGCAAGCACCTGCTGCGCTGGGGCTTCACCACCCCCGAC 

AAGAAGCACCAGAAGGAGCCCCCCTTCCTGCCCATCGAGCTGCACCCCGACAAGTGGACCGTGCAGCCC 

ATCGAGCTGCCCGAGAAGGAGAGCTGGACCGTGAACGACATCCAGAAGCTGGTGGGCAAGCTGAACTGG 

GCCAGCCAGATCTACCCCGGCATCAAGGTGCGCCAGCTGTGCAAGCTGCTGCGCGGCGCCAAGGCCCTG 

ACCGACATCGTGCCCCTGACCGAGGAGGCCGAGCTGGAGCTGGCCGAGAACCGCGAGATCCTGCGCGAG 

CCCGTGCACGGCGTGTACTACGACCCCAGCAAGGACCTGGTGGCCGAGATCCAGAAGCAGGGCCACGAC 

CAGTGGACCTACCAGATCTACCAGGAGCCCTTCAAGAACCTGAAGACCGGCAAGTACGCCAAGATGCGC 

ACCGCCCACACCAACGACGTGAAGCAGCTGACCGAGGCCGTGCAGAAGATCGCCATGGAGAGCATCGTG 

ATCTGGGGCAAGACCCCCAAGTTCCGCCTGCCCATCCAGAAGGAGACCTGGGAGACCTGGTGGACCGAC 

TACTGGCAGGCCACCTGGATCCCCGAGTGGGAGTTCGTGAACACCCCCCCCCTGGTGAAGCTGTGGTAC 

CAGCTGGAGAAGGAGCCCATCATCGGCGCCGAGACCTTCTACGTGGACGGCGCCGCCAACCGCGAGACC 

AAGATCGGCAAGGCCGGCTACGTGACCGACCGGGGCCGGCAGAAGATCGTGAGCCTGACCGAGACCACC 

AACCAGAAGACCGAGCTGCAGGCCATCCAGCTGGCCCTGCAGGACAGCGGCAGCGAGGTGAACATCGTG

ACCGACAGCCAGTACGCCCTGGGCATCATCCAGGCCCAGCCCGACAAGAGCGAGAGCGAGCTGGTGAAC 

CAGATCATCGAGCAGCTGATCAAGAAGGAGAAGGTGTACCTGAGCTGGGTGCCCGCCCACAAGGGCATC 

GGCGGCAACGAGCAGATCGACAAGCTGGTGAGCAAGGGCATCCGCAAGGTGCTCTAA 

atgagagtgatggggacacagaagaattgtcaacaatggtggatatggggcatcttaggc
ttctggatgctaatgatttgtaacacggaggacttgtgggtcacagtctactatggggta 
cctgtgtggagagacgcaaaaactactctattctgtgcatcagatgctaaagcatatgag 
acagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaacccacaa 
gaaatagttttgggaaatgtaacagaaaattttaatatgtggaaaaatgacatggcagat 
cagatgcatgaggatgtaatcagtttatgggatcaaagcctaaagccatgtgtaaagttg 
accccactctgtgtcactttaaactgtacagatacaaatgttacaggtaatagaactgtt 
acaggtaatagtaccaataatacaaatggtacaggtatttataacattgaagaaatgaaa 
aattgctctttcaatgcaaccacagaattaagagataagaaacataaagagtatgcactc 
ttttatagacttgatatagtaccacttaatgagaatagtgacaactttacatatagatta 
ataaattgcaatacctcaaccataacacaagcctgtccaaaggtctcttttgacccgatt 
cctatacattactgtgctccagctggttatgcgattctaaagtgtaataataagacattc 
aatgggacaggaccatgttataatgtcagcacagtacaatgtacacatggaattaagcca 
gtggtatcaactcaattactgttaaatggtagtctagcagaagaagggataataattaga 
tctgaaaatttgacagagaataccaaaacaataatagtacaccttaatgaatctgtagag 
attaattgtacaagacccaacaataatacaagaaaaagtgtaaggataggaccaggacaa 
gcantctatgcaacaaatgatgtaataggaaacataagacaagcacattgtaacattagt 
acagatagatggaacaaaactttacaacaggtaatgaaaaaattaggagagcatttccct 
aataaaacaatacaatttaaaccacatgcaggaggggatctagaaattacaatgcatagc 
tttaattgtagaggagaatttttctattgtaatacatcaaacctgtttaatagcacatac 
cactctaataatggtacatacaaatacaatggtaattcaagctcacccatcacactccaa 
tgtaaaataaaacaaattgtacgcatgtggcaaggggtaggacaagcaacgtatgcccct 
cccattgcaggaaacataacatgtagatcaaacatcacaggaatactattgacacgtgat 
ggaggatttaacaccacaaacaacacagagacattcagacctggaggaggagatatgagg 
gataactggagaagtgaattatataaatataaagtagtagaaattaagccattgggaata 
gcacccactaaggcaaaaagaagagtggtgcagagagaaaaaagagcagtgggaatagga 
gctgtgttccttgggttcttgggagcagcaggaagcactatgggcgcagcgtcaataacg 
ctgacggtacaggccagacaactgttgtctggtatagtgcaacagcaaagcaatttgctg 
aaggctatagaggcgcaacagcatatgttgcaactcacagtctggggcattaagcagctc 
caggcgagagtcctggctatagaaagatacctaaaggatcaacagctcctagggatttgg 
ggctgctctggaagactcatctgcaccactgctgtgccttggaactccagttggagtaat 
aaatctgaaaaagatatttgggataacatgacttggatgcagtgggatagagaaattagt 
aattacacaggcttaatatacaatttgcttgaagactcgcaaaaccagcaggaaaagaat 
gaaaaagatttattagaattggacaagtggaacaatctgtggaattggtttgacatatca 
aactggccgtggtatataaaaatattcataatgatagtaggaggcttgataggtttaaga 
ataatttttgctgtgctttctatagtgaatagagttaggcagggatactcacctttgtca 
tttcagacccttaccccaagcccgaggggactcgacaggctcggaggaatcgaagaagaa 
ggtggagagcaagacagagacagatccatacgattggtgagcggattcttgtcgcttgcc 
tgggacgatctgcggaacctgtgcctcttcagctaccaccgcttgagagacttcatatta 
attgcagtgagggcagtggaacttctgggacacagcagtctcaggggactacagaggggg 
tgggaaatccttaagtatctgggaagtcttgtgcaatattggggtctagagctaaaaaag
agtgctattagtctgcttgataccatagcaataacagtagctgaaggaacagataggatt 
atagaattagtacaaagaatttgtagagctatcctcaacatacctagaagaataagacag 
ggctttgaagcagctttgctataa
FIGURE 59 (SEQ ID NO:62)

atgagagtgatggggacacagaagaattgtcaacaatggtggatatggggcatcttaggc 

ttctggatgctaatgatttgtaacacggaggacttgtgggtcacagtctactatggggta 

cctgtgtggagagaagcaaaaactactctattctgtgcatcagatgctaaagcatatgag 

acagaagtgcataatgtctgggctacacatgcttgtgtacccacagaccccaacccacaa 

gaaatagttttgggaaatgtaacagaaaattttaatatgtggaaaaataacatggcagat 

cagatgcatgaggatataatcagtttatgggatcaaagcctaaagccatgtgtaaagttg 

accccactctgtgtcactttaaactgtacagatacaaatgttacaggtaatagaactgtt 

acaggtaatacaaatgataccaatattgcaaatgctacatataagtatgaagaaatgaaa 

aattgctctttcaatgcaaccacagaattaagagataagaaacataaagagtatgcactc 

ttttataaacttgatatagtaccacttaatgaaaatagtaacaactttacatatagatta 

ataaattgcaatacctcaaccataacacaagcctgtccaaaggtctcttttgacccgatt 

cctatacattactgtgctccagctgattatgcgattctaaagtgtaataataagacattc 

aatgggacaggaccatgttataatgtcagcacagtacaatgtacacatggaattaagcca 

gtggtatcaactcaactactgttaaatggtagtctagcagaagaagggataataattaga 

tctgaaaatttgacagagaataccaaaacaataatagtacatcttaatgaatctgtagag 

attaattgtacaaggcccaacaataatacaaggaaaagtgtaaggataggaccaggacaa 

gcattctatgcaacaaatgacgtaataggaaacataagacaagcacattgtaacattagt 

acagatagatggaataaaactttacaacaggtaatgaaaaaattaggagagcatttccct 

aataaaacaataaaatttgaaccacatgcaggaggggatctagaaattacaatgcatagc 

tttaattgtagaggagaatttttctattgcaatacatcaaacctgtttaatagtacatac 

taccctaagaatggtacatacaaatacaatggtaattcaagcttacccatcacactccaa 

tgcaaaataaaacaaattgtacgcatgtggcaaggggtaggacaagcaatgtatgcccct 

cccattgcaggaaacataacatgtagatcaaacatcacaggaatactattgacacgtgat 

gggggatttaacaacacaaacaacgacacagaggagacattcagacctggaggaggagat 

atgagggataactggagaagtgaattatataaatataaagtggtagaaattaagccattg 

ggaatagcacccactaaggcaaaaagaagagtggtgcagagaaaaaaaagagcagtggga 

ataggagctgtgttccttgggttcttgggagcagcaggaagcactatgggcgcagcgtca 

ataacgctgacggtacaggccagacaactgttgtctggtatagtgcaacagcaaagcaat 

ttgctgaaggctatagaggcgcaacagcatatgttgcaactcacagtctggggcattaag 

cagctccaggcgagagtcctggctatagaaagatacctaaaggatcaacagctcctaggg 

atttggggctgctctggaagactcatctgcaccactgctgtgccttggaactccagttgg 

agtaataaatctgaagcagatatttgggataacatgacttggatgcagtgggatagagaa 

attaataattacacagaaacaatattcaggttgcttgaagactcgcaaaaccagcaggaa 

aagaatgaaaaagatttattagaattggacaagtggaataatctgtggaattggtttgac 

atatcaaactggctgtggtatataaaaatattcataatgatagtaggaggcttgataggt 

ttaagaataatttttgctgtgctctctatagtgaatagagttaggcagggatactcacct 

ttgtcatttcagacccttaccccaagcccgaggggactcgacaggctcggaggaatcgaa 

gaagaaggtggagagcaagacagagacagatccatacgattggtgagcggattcttgtcg 

cttgcctgggacgatctgcggagcctgtgcctcttcagctaccaccgcttgagagacttc 

atattaattgcagtgagggcagtggaacttctgggacacagcagtctcaggggactacag 

agggggtgggagatccttaagtatctgggaagtcttgtgcagtattggggtctagagcta 

aaaaagagtgctattagtccgcttgataccatagcaatagcagtagctgaaggaacagat 

aggattatagaattggtacaaagaatttgtagagctatcctcaacatacctaggagaata 

agacagggctttgaagcagctttgctataa 
FIGURE 60 (SEQ ID NO:63)

atgagagcgagggggatactgaagaattatcgacactggtggatatggggcatcttaggc 
ttttggatgctaatgatgtgtaatgtgaagggcttgtgggtcacagtctactacggggta 
cctgtggggagagaagcaaaaactactctattttgtgcatcagatgctaaagcatatgag 
aaagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaacccacaa 
gaagtgattttgggcaatgtaacagaaaattttaacatgtggaaaaatgacatggtggat 
cagatgcaggaagatataatcagtttatgggatcaaagccttaagccatgtgtaaaattg 
accccactctgtgtcactttaaactgtacaaatgcaactgttaactacaataatacctct 
aaagacatgaaaaattgctctttctatgtaaccacagaattaagagataagaaaaagaaa 
gaaaatgcacttttttatagacttgatatagtaccacttaataataggaagaatgggaat 
attaacaactatagattaataaattgtaatacctcagccataacacaagcctgtccaaaa 
gtctcgtttgacccaattcctatacattattgtgctccagctggttatgcgcctctaaaa 
tgtaataataagaaattcaatggaataggaccatgcgataatgtcagcacagtacaatgt 
acacatggaattaagccagtggtatcaactcaattactgttaaatggtagcctagcagaa 
gaagagataataattagatctgaaaatctgacaaacaatgtcaaaacaataatagtacat 
cttaatgaatctatagagattaaatgtacaagacctggcaataatacaagaaagagtgtg 
agaataggaccaggacaagcattctatgcaacaggagacataataggagatataagacaa 
gcacattgtaacattagtaaaaatgaatggaatacaactttacaaagggtaagtcaaaaa 
ttacaagaactcttccctaatagtacagggataaaatttgcaccacactcaggaggggac 
ctagaaattactacacatagctttaattgtggaggagaatttttctattgcaatacaaca 
gacctgtttaatagtacatacagtaatggtacatgcactaatggtacatgcatgtctaat 
aatacagagcgcatcacactccaatgcagaataaaacaaattataaacatgtggcaggag 
gtaggacgagcaatgtatgcccctcccattgcaggaaacataacatgtagatcaaatatt 
acaggactactattaacacgtgatggaggagataataatactgaaacagagacattcaga 
cctggaggaggagacatgagggacaattggagaagtgaattatataaatacaaggtggta 
gaaattaaaccattaggagtagcacccactgctgcaaaaaggagagtggtggagagagaa 
aaaagagcagtaggaataggagctgtgttccttgggttcttgggagcagcaggaagcact 
atgggcgcagcatcaataacgctgacggtacaggccagacaattattgtctggtatagtg 
caacagcaaagtaatttgctgagggctatagaggcgcaacagcatatgttgcaactcacg 
gtctggggcattaagcagctccaggcaagagtcctggctatagagagatacctacaggat 
caacagctcctaggactgtggggctgctctggaaaactcatctgcaccactaatgtgctt 
tggaactctagttggagtaataaaactcaaagtgatatttgggataacatgacctggatg
cagtgggatagggaaattagtaattacacaaacacaatatacaggttgcttgaagactcg 
caaagccagcaggaaagaaatgaaaaagatttactagcattggacaggtggaacaatctg 
tggaattggtttagcataacaaattggctgtggtatataaaaatattcataatgatagta 
ggaggcttgataggtttaagaataatttttgctgtgctctctctagtaaatagagttagg 
cagggatactcacccttgtcattgcagacccttatcccaaacccgaggggacccgacagg 
ctcggaggaatcgaagaagaaggtggagagcaagacagcagcagatccattcgattagtg 
agcggattcttgacacttgcctgggacgacctacgaagcctgtgcctcttctgctaccac 
cgattgagagacttcatattaattgtagtgagagcagtggaacttctgggacacagtagt 
ctcaggggactgcagagggggtggggaacccttaagtatttggggagtcttgtgcaatat 
tggggtctagagttaaaaaagagtgctattaatctgcttgatactatagcaatagcagta 
gctgaaggaacagataggattctagaattcatacaaaacctttgtagaggtatccgcaac 
gtacctagaagaataagacagggcttcgaagcagctttgcaataa 
FIGURE 61 (SEQ ID NO:64)

atgagagtgagggggatactgaggaattggcaacaatggtggatatggggcatcttaggc 
ttttggatgttaatgatttatagtgtattggggaacttgtgggtcacagtctattatggg 
gtacctgtgtggaaagaagcaaaaactactctattctgtgcatcagatgctaaagcatat 
gagagagaagtgcataatgtctgggctacacatgcctgtgtgcccacagaccccaacccg 
caagaaatggtcttgggaaatgtaacagaaaattttaacatgtggaaaaatgatatggtg 
gatcagatgcatgaggatataatcagtttatgggatcaaagcctaaagccatgtgtaaag 
ttgaccccactctgtgtcactttagagtgtaataacgttaatactaccaatgaaatgaca 
aattgctctttcaatgcaaccacagacgtaagagataagaaacagagagtgtctgcattt 
ttttatagacttgatatagtaccacttaatgagaataacaatgaatcccagaagtataga 
ttaataagttgcaatacctcaaccataacacaagcctgtccaaaggtcacttttgaccca 
attcctatacattactgtactccagctggttatgcgattctaaagtgtaataataagaca 
ttcaatgggacaggaccatgccataatgtcagcacagtacaatgtacacatggaattaag 
ccagtagtatcaactcaactactattgaatggtagcctagcagaagaagagataatcatt 
agacctgaaaatctgacaaacaatgccaaaataataatagtacaccttaatgaatctgta 
gaaattgtgtgtacaagacccaacaataatacaagaaaaagtataaggataggaccggga 
caaacattctatgcaacaaatggcataataggaaacataagacaagcacattgtaacatt 
agtgaagagagatggaacaaaaccttacaacaggtaggaaaaaaattagcagaacacttc 
cctaataaaacaataaagtttgaaccatcctcaggaggggatctagaaattactacacat 
agctttaattgtggaggagaatttttctattgcaatacatcaggcctgtttaatggtaca 
tacaatcacactacagaaggtaattcaaactcaaccatcacactcccatgcagaataaaa 
caaattataaacatgtggcgggaggtaggacgagcaatgtatgctcctcccattgcagga 
aacataacatgtaaatcaaatatcacaggattactattagtgcgtgatggaggagaaagc 
aatgactcagacaacaacatcgagatattcagacctggaggaggagatatgaggaacaat 
cggagaagtgaattatataaatataaagtggtagaaattaagccattgggaatagcaccc 
actggggcaaaaaggagagtggtggagagagaaaaaagagcagtgggactaggagctatg 
ttccttgggttcttgggagcagcaggaagcactatgggcgcggcgtcaataacgctgacg 
gtacaggccagacaactgttgtctggtatagtgcaacagcaaagcaatttgctgaaggct 
atagaggcgcaacagcatatgttgcaactcacggtctggggcattaagcagctccagaca 
agagtcctggctatagaaagatacctaaaggatcaacagctcctagggctttggggctgc 
tctggaaaactcatctgcaccactgctgtgccttggaactccagttggagtaataaatct
gtaacagatatttgggataacatgacctggatgcagtgggatagggaaattagtaattac
acaaacacaatatacaggttgcttgaagactcgcaaacccagcaggaacaaaatgaaaaa 
gatttattagcactggacagttggaataatttgtggaattggtttaacataacaaagtgg 
ctgtggtacataaaaatattcataatgatggtaggaggcttgataggcttaagaataatt 
tttgctgtgctctctgtagtaaatagagttaggcaggggtattcaccattatcgtttcag 
acccttatcccaagcccgaggggacccgacaggctcggaagaatcgaagaagaaggtgga 
gagcaagacagagacagatccgtgcgattagtgaacggattcttagccattgcctgggac 
gatctacggagcctgtgtcttttcagctaccaccgattgagagacttcatattgattgca 
acgagagcggtggaacttctgggacgcagcagtctcaggggattgcagagggggtgggaa 
gcccttaagtatctaggaagtcttgtgcagtattggggtctggaactaaaaaagagtgct 
gttagtctgcttgataccgtagcaatagtagtagctgaaggaacagataggattatagaa 
ttagtacaaagagtttgcagagctatccgcaacatacctacaagaatcagacagggcttt 
gaaacagctttgctataa 
FIGURE 62 (SEQ ID NO:65)



atgagagtgagggagataccgaggaattggcaacaatggtggatatggggaatcttaggc 
ttttggatggtaatgatttgtaatgtgatggggaacttgtgggtcacagtctattatggg 
gtacctgtgtggaaagaagcaaaaactactctattctgtgcatcagatgctaaagcatat 
gagaacgaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaaccca 
caagaaatagttttggaaaatgtaacagaaaattttaacatgtggaaaaatgacatggtg 
gatcagatgcatgaggatataatcagtttatgggatcaaagcctacagccatgtgtaaag 
ttgaccccactctgtgtcactttaaattgtacaacggttaccaacagtaccgtcaataac 
acgcgtggagagatgcgaaattgctctttcaatatgaccacagaagtaagagataagaaa 
cagcaagtgtatgcacttttttataaacttgatgtagtaccacttaatgaaaataatagt 
gactctagcaactttagtgagtatagattaataaattgtaatacctcagccatgacacaa 
gcctgtccaaaggtcacttttgacccaattcctatacattattgtgctccagctggttat 
gcgattctaaagtgtaataataagacatttaatgggacaggaccatgcagtaatgtcagc 
acagtacaatgtacacatggaattaagccagtggtatcaactcaactcctgttaaatggt 
agcctagcagaaaaagaaataataattagatccgaaaatctgacaaacaatgtcaaaaca 
ataatagtacatcttaatgaatccatagaaattaggtgtacaagacccaacaataataca 
agaaaaagtataaggataggaccaggacaaacattctatgcaacaggagaaataatagga 
gacataagacaagcacactgtaccattagtagtcagaactggaatagaactttacaaagg 
gtaagtgaaaaattaaaagaacacttccctaataaaacaataaaatttgaaccatcctca 
ggaggggacctagaaataacaacacatagctttaattgtagaggagaatttttttattgc 
aatacatcaggcctatttaatagaacatttaatagtacatacatgcataatagtacaaac 
aatgactcaatcatcacaatcccatgcagaataaaacaaattataaacatgtggcaggag 
gtaggaagagcaatgtatgcccctcccgttgcaggaaacataacatgtaaatcaaatatc 
acaggactactattggtacgggatggaggcgaaaatggcacaaataacacagaggtattc 
agacctggaggaggaaatatgagggacaattggagaagtgagttatataaatataaagtg 
gtagaaattaaaccattgggagtagcacccaataaggcaaaaaggagagtggtggagaga 
gaaaaaagagcagtgggaataggagctgtgttccttgggttcttgggagcagcaggaagc 
actatgggcgcggcgtcaatagcgctgacggcacaagccagacaagtattgtctggtata 
gtgcaacagcaaagcaatttgctgaaggctatagaggcgcagcagcatctgttgcaactc 
acagtctggggcattaagcagctccagacaagagtcctggctatagaaagatacctaaag 
gatcaacagctcctagggatttggggctgctctggaaaaatcatctgccccactgctgtg 
ccttggaactccagttggagtaataaatctcaagaagatatttggggaaacatgacctgg 
atgcagtgggatagagaaattagtagttacacaaacacaatatacaatttgcttgaagaa 
tcgcaaagacagcaggagaaaaatgaaaaggatttattagaattggacagttggaacttt 
ttgtggagttggtttgacataacaaagtggctgtggtatataaaaatattcataataata 
gtaggaggcttgataggtttaagaataatttttgctgtgctctctatagtgaatagagtt 
aggcagggatactcacctttgtcgttccagacccttaccccgagcccagggggacccgac 
aggctcggaagaatcgaagaagaaggtggagagcaagacagagacagatccgtgagatta 
gtgaacggattcttagcacttgcctgggacgacctgcggagcctgtgccttttcagctac 
caccgattgagagacttcatattggtgacagcgagagcggtggaacttctgggacgcagc 
agtctcaggggactacagagggggtgggaagctcttaagtatctgggaagccttgtgcaa 
tattggggtctggagctaaaaaagagtgctactagcctgcttgataccatagcaataaca 
gtagctgaaggaacagataggattatagaaatagtacaaagattctgtagagctatcctc 
catatacctagaagaataagacagggctttgaagcagctttgctataa 
FIGURE 63 (SEQ ID NO:66)



gtcgacaagagcagaagacagtggcaatgagagtgacggggatactgaggaattacccac 
aatggtggatatgggtcatcttaggcttttagataatatataatgtgggagggatgtggg 
tcacagtctattatggggtacctgtgtggaaggaggcaaaaactactctattttgtgcat 
cagatgctaaagcatatgataaagaagtgcataatgtctgggccacacatgcctgtgtac 
ccacagatcccaacccacaagaattggttttggaaaatgtaacagaaaattttaatatgt 
ggaaaaatgacatggtggatcagatgcatgaagacataatcagtttatgggatgaaagcc 
taaaaccatgtgtaaagttgaccccactctgtgtcactttaaattgtaaggcaaatgtta 
ctgttaatactacgaactttaatgatagcatgattgaacaaatgagaaattgctctttca 
atataaccacagaactaagagataagaaaaagcaagtgtatgcacttttttataagcttg 
atataatacaacttgataatgacaactctagtgacaactctggttatagattaataaatt 
gtaatacctcagccataacacaagcctgtccaaaggtcacttttgacccaattcctatac 
attattgtgctccagctggatatgcgattctaaagtgtaataataagacattcaatggaa 
caggaccatgcagtaatgttagcacagtacaatgtacacatggaattaagccagtggtat 
caactcaactactgttaaatggtagcctagcagaaggagatataataattagatctcaaa 
acctgacaaacaatgccaaaataataatagtacatcttaatgaatctgtagaaattgtgt 
gtacaagacccggcaataatacaagacaaagtataaggataggaccaggacaaacattct 
atgcaacaggagacataataggagacataaggcaagcacattgtaacattagtgcaggga 
aatggaatgaaactttaaaaagggtaagtaaaaaattaggagaacactttcctaataaaa
caataaaatttgcaccacactcaggaggggacctagaaattacaatgcatagttttaatt 
gtagaggagaatttttttattgtaatacatcaagtctgtttaatagtagttataatacat 
cagacctgtttaatagtaataatggttcagccatcacactcccatgcagaataaaacaaa 
ttgtaaacatgtggcagggggtaggacgagcaatatatgcccctcccattgcaggaaaca 
taacatgtaactcaagtatcacaggactactcttggtacgtgatggaggaaacacaacca 
actcaactgagatattcagaccagaaggaggaaatatgagggacaattggagaagtgaat 
catataaatacaaagtggtagaaattaagcccttgggaatagcgcccactaatgcaaaaa 
ggagagtggtggagagagaaaaaagagcagtgacactaggagctatgttccttgggttct 
tgggagcagcaggaagcactatgggcgcagcgtcaataacgctgacggcacaggccagac 
agttgttgtctggaatagtgcaacagcaaagcaatttgctgagagctatagagacgcaac 
agcatatgttgcaactcacagtttggggcattaaacagctccaagcaagagtcttggcta 
tagaaagatacctaaaggatcaacagctcctaggaatttggggctgctctggaaaactca 
tctgcaccactgctgtgccttggaactccagttggagtaataaaactgagaaagatattt 
gggaaaacatgacctggatgcagtgggatagagaaattagtaattacacagacataatat 
acaacttacttgaagtctcgcaaatccagcaggaacagaataaaaaagatttattagcat 
tggacagttggaaaattctgtggagttggtttgacatatcaagttggctgtggtacataa 
gaatattcataatgatagtaggaggcttgataggcttgagaataatttctgctgtgcttt 
ctatagtgaatagagttaggcagggatactcacctttgtcgtttcagacccttgccccga 
acccaagggaactcgacaggctcggaagaatcgaagaagaaggtggagagcaagacagag 
acagatcgattcgattagtacaaggattcttagcacttgcctgggacgacttgaggagcc 
tgtgccttttcagctaccaccgattgagagacttcatattgattgcagcgaaagcagcgg 
aacttctgggacacaacagtctcaggggactacagagggggtgggaaatccttaagtatc 
tgggaagtcttgctcaatattggggtctagaactcaaaaagagtgctattagtttgcttg 
ataccatagcaatagcagtagctgaaggaacagataggattatagaattaatacaaagaa 
tttggagagctatccgcaacacacctagaagaataagacagggctttgaagcagctttgc 
aataactctagaaagaaacaagggcgaattc

79/158
WO 03/004620
PCT/US02/21420

FIGURE 64 (SEQ ID NO:67) 



gtcgacaagagcagaagacagtggcaatgagagtgagggggatactgaggaattatccac 
aatggtggatatgggtcatcttaggcttttggataatatataatgtgggagggaacatgt 
gggtcacagtctattatggggtacctgtgtggaaagatgcaaaaactactctattttgtg 
catcagatgctaaagcatatgataaagaagtgcataatgtctgggccacacatgcctgtg 
tacccacagatcccaacccacaagaattagttttggaaaatgtaacagaaaattttaaca 
tgtggaaaaatgacatggtggatcagatgcatgaagacataatcagtttatgggatgaaa 
gcctaaaaccatgtgtaaagttgaccccactctgtgtcactttaaattgtacagataatg
ttactgttaatactacgagccttactgttagccctactgttaacataactgaacaaataa 
gaaattgctctttcaatataaccacagaactaagggataagaaaaagcaagtgtatgcac 
ttttttataggcttgacatagtacaatttgataatgacaactctagttataggttaataa 
attgtaatacctcagccataacacaagcctgtccaaaggtcacttttgacccaattccta 
tacattattgtgctccagctggatatgcgattctaaagtgtaataataagacattcaatg 
gaacaggaccatgcagtaatgtcagcacagtacaatgtacacatggaattaagccagtgg 
tatcaactcaactactgttaaatggtagcctagcagaaggagatataataattagatctc 
aaaacctgacaaacaatgccaaaataataatagtacatcttaatgaatctgtagaaattg 
tgtgtacaagacccggcaataatacaagacaaagtataaggataggaccaggacaaacat 
tctatgcaacaggagacataataggagacataaggcaagcacattgtaacattagtgcag 
ggaaatggaatgaaactttaaaaagggtaagtaaaaaattaggagaacactttcctaata 
aaacaataaaatttgcaccacactcaggaggggacctagaaattacaatgcatagtttta 
attgtagaggagaatttttttattgtaatacatcaagtctgtttaatagtagttataata 
catcaggcctgtttaatagtaataatggttcaaccatcacactcccatgcagaataaaac 
aaattgtaaacatgtggcagggggtaggacgagcaatatatgcccctcccattgcaggaa 
acataacatgtaactcaagtatcacaggactactcttggtacgtgatggaggaaacacaa 
ccaactcaaccgagacattcagaccagaaggaggaaatatgagggacaattggagaagtg 
aattatataaatataaagtggtagaaattaagcccttgggaatagcgcccactaatgcaa 
aaaggagagtggtggagagagaaaaaagagcagtgacactaggagctatgttccttgggt 
tcttgggagcagcaggaagcactatgggcgcagcgtcaatagcgctgacggcacaggcca
gacggttgttgtctggaatagtgcaacagcaaagtaatttgctgaaagctatagaggcgc 
aacagcatatgttgcaactcacagtttggggcattaaacagctccaagcaagagtcttgg 
ctatagaaagatacctaaaggatcaacagctcctaggaatttggggctgctctggaaaac 
tcatctgcaccactgctgtgccttggaactccagttggagtgataaaactgagaaagata 
tttgggaaaacatgacctggatgcagtgggatagagaaattagtaattacacagacataa 
tatacaatttacttgaagtctcgcaaatccagcaggaacagaatgaaaaagatttattgg 
cattggacagttggaaaagtctgtggaattggtttgacatatcaaaatggctgtggtaca 
taaaaatattcataatgatagtaggaggcttgataggcttgagaataatttttgctgtgc 
tttctatagtgaatagagttaggcagggatactcacctttgtcatttcagacccttatcc
cgaacccaagggaactcgacaggctcggaagaatcgaagaagaaggtggagagcaagaca 
gagacagatcgattcgattagtacaaggattcttagcacttgcctgggacgacttgagga 
gcctgtgccttttcagctaccaccgattgagaaacttcatattgattgctgcaagagcag 
cggaacttctgggacacagcagtctcaggggactacagagggggtgggaaatccttaagt 
atctgggaagtcttgcacaatattggggtctagaactcaaaaggagtgctattagtctgc 
ttgacatcacagcaattgcagtagctgaaggaacagataggattatagaattaatacaaa 
gaatttggagagctatccgcaacatacctacaaggataagacagggctttgaagcagctt 
tgcaataactctagaaagaaacaagggcgaattc 

atgagagtgacggggatactgaggaattatccacaatggtggatatgggtcatcttaggc
ttttggataatatataatgtgggagggaacatgtgggtcacagtctattatggggtacct 
gtgtggaaagaggcaaaaactactctattttgtgcatcagatgctaaagcatatgataaa 
gaagtgcataatgtctgggccacacatgcctgtgtacccacagatcccaacccacaagat 
ttggttttggaaaatgtaacagaaaattttaatatgtggaaaaatgacatggtggatcag 
atgcatgaagacataatcagtttatgggatgaaagcctaaaaccatgtgtaaagttgacc 
ccactctgtgtcactttaaattgtaaagcaaatgttactgttaaaactaatgcaaatgtt 
actgttaatactacgaactttaatgatagcatgattgaacaaatgaggaattgctctttc 
aatataaccacagaactaagagataagaaaaagcaagtgtatgcacttttttataggctt 
gatatagtacaatttgacaatgacaactctagttataggttaataaattgtaatacctca 
gccataacacaagcctgtccaaaggtcacttttgacccaattcctatacattattgtgct 
ccagctggatatgcgattctaaagtgtaataataagacattcaatggaacaggaccatgc 
agtaatgttggcacagtacaatgtacacatggaattaagccagtggtatcaactcaacta 
ctgttaaatggtagcctagcagaaggagatataataattagatctcaaaacctgacaaac 
aatgccaaaataataatagtacatcttaatgaatctgtagaaattgtgtgtacaagaccc 
ggcaataatacaagacaaagtataaggataggaccaggacaaacattctatgcaacagga 
gacataataggagacataaggcaagcacattgtaacattagtgcagggaaatggaatgaa 
actttaaaaagggtaagtaaaaaattaggagaacactttcctaataaaacaataaaattt 
gcaccacactcaggaggggacctagaaattacaatgcatagttttaattgtagaggagaa 
tttttttattgtaatacatcaagtctgtttaatagtagttataatacatcaggcctgttt
aacagtaataatggttcaaccatcacactcccatgcagaataaaacaaattgtaaacatg 
tggcagggggtaggacgagcaatatatgcccctcccattgcaggaaacataacatgtaac 
tcaagtatcacaggactactcttggtacgtgatggaggaaacataaccaactcaaccgag 
atattcagaccagaaggaggaaatatgagggacaattggagaagtgaattatataaatat 
aaagtggtagaaattaagccattgggaatagcgcccactaatgcaaaaaggagagtggtg 
gagagagaaaaaagagcagtgacactaggagctatgttccttgggttcttgggagcagca 
ggaagcactatgggcgcagcgtcaataacgctgacggcacaggccagacagttgttgtct 
ggaatagtgcaacagcaaagcaatttggtgagagctatagaggcgcaacagcatatgctg 
caactcacagtctggggcattaagcagctccaagcaagagtcttggctatagaaagatac 
ctaaaggatcagcagctcctaggaatttggggctgctctggaaaactcatctgcaccact 
gctgtgccttggaactccagttggagtagtaaaactgagaaagatatttgggaaaatatg 
acctggatgcagtgggatagagaaattagtaattacacagacataatatacaacctactt 
gaagtctcgcaaatccagcaggaacagaatgaaaaagatttattagcattggacagttgg 
aaaaatctgtggaattggtttgacatatcaaaatggctgtggtacataaaaatattcata 
atgatagtaggaggcttgataggcttgaggataatttttgctgtgctttctatagtgaat
agagttaggcagggatactcacctttgtcgtttcagacccttatcccgaacccaagggaa
ctcgacaggctcggaagaatcgaagaagaaggtggagagcaagacagagacagatcgatt 
cgattagtacgaggattcttagcacttgcctgggacgacttgaggagcctgtgccttttc 
agctaccaccgattgagagacttcatattgattgcagcgagagcagcggaacttctggga 
catagcagtctcaggggactacagagggggtgggaaatccttaagtatctgggaagtctt 
gcacaatattggggtctagaactcaaaaagagtgctattagtctgcttgacatcacagca 
attgcagtagctgaaggaacagatagaattatagaattaatacaaagaatttggagagct 
atccgcaatatacctacaagaataagacagggctttgaaacagctttgctataa 

atgagagtgagggggatactgaggaattatcaacaatggtggatatgggccagcttaggc
ttttggatgttaatgagttataatgtggtggggaacttgtgggtcacagtctattacggg 
gtacctgtgtggaaagaagcaaaaactactctattctgtgcatcagatgctaaaggatat 
gaaaaagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaaccca 
caagaactggttgtggaaaatgtaacagaaaattttaacatgtggaaaaatgacatggta 
gatcagatgcatgaggatataatcagtttatgggaccaaagcctaaagccatgtgtaaag 
ttgaccccactctgtgtcactttaagatgtgtaaatgttaatgctaccagtaatgctacc 
agtagtagtagtgctacctctgataatcccatgaatggagaaataaaaaattgctctttc 
aatgtaaccacagaaataagggataggaaaaaggaagtgtatgcacttttttataaacct 
gatgtagtatcacttgacaactctagtacatatagattaataaattgtaatacttcaacc 
ctaacacaagcctgtccaaaagtcacttttgatccaattcctatacattattgtgctcca 
gctggttatgcgattctaaagtgtaataataagacattcaatgggacaggaccatgcact 
aatgtcagcacagtacaatgtacacatggaattaagccagtagtatcaactcaattactg 
ttaaatggtagcctagcagaaaaagagataataattaaatctaaaaatctgacaaacaat
gcccaaacaataatagtacatcttaacgaatctatagaaattaggtgtccaagacccaac 
cataatacaagacgaagtataaggataggaccaggacaagcattctatgcaacaggagac 
ataataggagatataagacaagcacactgtaacattagcgaaagtaaatggaataaaact 
ttacaaagggtaagtaaaaaattaggagaacacttccctaataaaacaataaaatttgca 
ccacattcaggaggggacctagaaattacaacacatagctttaattgtagaggggaattt 
ttctattgcaatacatcaaaactgtttaatagtacatacatgcctaatgttacagaaagt 
aatggtacagaaagtaatgtaacgatgatcacactcccatgcagaataaagcaaattata 
aacatgtggcaggaggtaggacgagcaatgtatgcccctcccattgcaggcaacataaca 
tgtacatcaaacatcacaggactactattggtacgtgatggaggcacagaggataatacc 
acagagatattcagacctggaggaggagatatgagagataattggagaaatgaactatac 
aaatataaagtggtagaaattaagccattgggaatagcacccactacagcaaaaaggaga 
gtggcggagagagaaaaaagagcagcaggactaggagctgtactccttggattcttggga 
gcagcaggaagcactatgggcgcggcgtcaataacgctgacggtacaggccagacaattg 
ttgtctggtatagtgcaacagcaaagcaatttgctgaaagctatagaggcgcaacagcat
gtgttgcagctcacggtctggggcattaagcagctccagacaagagtcctggctatagaa
agatacctaaaggatcaacagctcctaggaatttggggctgctctggaaaactcatctgc 
accactgctgtgccttggaactccagttggagtaatagatctcaaacagatatttggaat 
aacatgacctggatgcagtgggatagagaaattagtaattacacagacacaatatacaag 
ttgcttgaagaatcgcaaaaccagcaggaaaataatgaaaaggatttattagcattgaac 
agctggcaaaatctgtggagttggtttaacataacaaactggctgtggtatataagaatc 
tttataatgatagtaggaggcttgataggtttaaggataatttttgctgtgatctctata 
gtgaatagagttaggcagggatactcacctttgttgtctcagacccttaccccaaacccg 
aggggacccgacaggctcggaagaatcgaagaagaaggtggagagcaagacaaagacaga 
tccattcgattagtgagcggattcttgtcacttgcctgggacgatctgcggagcctgtgc 
ctcttcagctaccaccgattgagagacttaatattgattgtagtgagagcggtggaactt 
ctgggacgcagcagtctcagggggctgcagagggggtgggaagcccttaagtatctggga 
ggccttgtatagtattggggtctggaactaaaaaagagtgctattagtctgtttgatacc 
atagcaatagcagtagctgaaggaacagataggattatagaattagtacaaggaatttgt 
agagctatcctcaacatacctagaagaataagacagggctttgaagcagctttgcaataa 
aatgggtggcaagtggtcaaaaagaatcgaattcccgcggccgccatgcggccgggagca 
tgcgacgtcgggccca 

atgagagtgagggggatactgaggaattatcaacaatggtggatatgggccagcttaggc
ttttggatgttaatgagttataatgtggtggggaacttgtgggtcacagtctattacggg 
gtacctgtgtggaaagaagcaaaaactactctattctgtgcatcagatgctaaaggatat 
gaaaaagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaaccca 
caagaactggttgtggaaaatgtaacagaaaattttaacatgtggaaaaatgacatggta 
gatcagatgcatgaggatataatcagtttatgggaccaaagcctaaagccatgtgtaaag 
ttgaccccactctgtgtcactttaagatgtgtaaatgttaatgctaccagtaatgctacc 
agtagtagtagtgctacctctgataatcccatgaatggagaaataaaaaattgctctttc 
aatgtaaccacagaaataagggataggaaaaaggaagtgtatgcacttttttataaacct 
gatgtagtatcacttgacaactctagtacatatagattaataaattgtaatacttcaacc 
ctaacacaagcctgtccaaaagtcacttttgatccaattcctatacattattgtgctcca 
gctggttatgcgattctaaagtgtaataataagacattcaatgggacaggaccatgcact 
aatgtcagcacagtacaatgtacacatggaattaagccagtagtatcaactcaattactg 
ttaaatggtagcctagcagaaaaagagataataattaaatctaaaaatctgacaaacaat 
gcccaaacaataatagtacatcttaacgaatctatagaaattaggtgtccaagacccaac 
cataatacaagacgaagtataaggataggaccaggacaagcattctatgcaacaggagac 
ataataggagatataagacaagcacactgtaacattagcgaaagtaaatggaataaaact 
ttacaaagggtaagtaaaaaattaggagaacacttccctaataaaacaataaaatttgca 
ccacattcaggaggggacctagaaattacaacacatagctttaattgtagaggggaattt 
ttctattgcaatacatcaaaactgtttaatagtacatacatgcctaatgttacagaaagt 
aatggtacagaaagtaatgtaacgatgatcacactcccatgcagaataaagcaaattata 
aacatgtggcaggaggtaggacgagcaatgtatgcccctcccattgcaggcaacataaca 
tgtacatcaaacatcacaggactactattggtacgtgatggaggcacagaggataatacc 
acagagatattcagacctggaggaggagatatgagagataattggagaaatgaactatac 
aaatataaagtggtagaaattaagccattgggaatagcacccactacagcaaaaaggaga 
gtggcggagagagaaaaaagagcagcaggactaggagctgtactccttggattcttggga 
gcagcaggaagcactatgggcgcggcgtcaataacgctgacggtacaggccagacaattg 
ttgtctggtatagtgcaacagcaaagcaatttgctgaaagctatagaggcgcaacagcat
gtgttgcagctcacggtctggggcattaagcagctccagacaagagtcctggctatagaa 
agatacctaaaggatcaacagctcctaggaatttggggctgctctggaaaactcatctgc 
accactgctgtgccttggaactccagttggagtaatagatctcaaacagatatttggaat 
aacatgacctggatgcagtgggatagagaaattagtaattacacagacacaatatacaag 
ttgcttgaagaatcgcaaaaccagcaggaaaataatgaaaaggatttattagcattgaac 
agctggcaaaatctgtggagttggtttaacataacaaactggctgtggtatataagaatc 
tttataatgatagtaggaggcttgataggtttaaggataatttttgctgtgatctctata 
gtgaatagagttaggcagggatactcacctttgttgtctcagacccttaccccaaacccg 
aggggacccgacaggctcggaagaatcgaagaagaaggtggagagcaagacaaagacaga 
tccattcgattagtgagcggattcttgtcacttgcctgggacgatctgcggagcctgtgc 
ctcttcagctaccaccgattgagagacttaatattgattgtagtgagagcggtggaactt
ctgggacgcagcagtctcagggggctgcagagggggtgggaagcccttaagtatctggga 
ggccttgtatag 

gtcgacaagagcagaagacagtggcaatgagagtgatggggatactgaggaattgtccac
aatggtggatatggggcatcttaagcttttggatgttaatgatttgtaatgtaggaggga 
aattgtgggtcacagtctattatggggtacctgtgtggaaagaagcaaaaactactctat 
tctgtgcatctgatgctaaagcatatgagagggaggtgcataatgtttgggctacacatg 
cctgtgtacccacagaccccaacccacaagaaatagtattggaaaatgtaacagaaaatt 
ttaacatgtggaaaaatgacatggtggatcagatgcatgaggatataattagtttatggg 
atcaaagcctaaaaccatgtgtaaagttgaccccactctgtgtcactttaaattgtagtg 
atgttatccccagtaatgttaccaacactacagttacccacaataacatcacggataaag 
aggaaatgagaaattgtacttttaatataaccacagaaataacagataagaaaagcaaag 
agtatgcaattttttatagacttgatgtagtaccacttaatgagaaggataacaaatcta 
ctgagtgtagattaataaattgtaatacctcaactgtaacacaagcctgtccaaaggtct 
cttttgaaccaattcctatacattattgtgctccagctggttatgcgattctaaaatgta 
ataataagacattcaatgggacaggaccatgcaataatgtcagtacaatacaatgtacac 
atggaatcaagccagtggtatcaactcaactactgctaaatggtagcctagcagaaaaag 
agataataattagatctgaaaatctgacagacaatgcaaaaacaataatagtacatctta 
atgaatccatacgcattatgtgtacaagacccaataataatacaagaaaaagtataagaa 
taggaccaggacaaacattctttgcaacaaacgacataataggagacataagacaagcat 
attgtaacattagtaaagatgactggaataaaaccttacaaaggatagctgagaaattag 
gaaaacacttccctaataaaaacataacgtttagaccatcctcaggaggggacctagaaa 
ttacaacacatagctttaattgtagaggggaatttttctattgcaatacatcaagactgt 
ttaatcatacatacctgtttaatggtacaggcgtgcctaataataccacaccttctaatg 
agaccatcatacttccatgcagaataaaacaaattataaacatgtggcaggaggtagggc 
gagcaatgtatgcccctcccattgcaggaaacatcacatgtacatcaaacatcacaggac 
tactattagtacgtgatggaggcaacagtggcaaaaataccacagaagagatattcagac 
ctgggggaggaaatatgaaggacaattggagaagtgaattatataaatataaagtggtag
aaattaagccattaggaatagctcccactgcggcaaaaaggagagtggtggagagagaaa 
aaagagcagtgggaataggggctgtgttccttgggttcttgggagcagcaggaagcacta 
tgggcgcggcgtcaataacgctgacggtacaggccagacaattgttgtctggtatagtgc 
aacagcaaagcaatttgctgagggctatagaggcgcaacagcatctgttgcaactcacag 
tctggggcattaagcagctccagacaagagtcctggctatggaaagatacctacgggatc 
aacagctcctaggaatttggggctgctctggaaaactcatctgcaccactaatgtgcctt 
ggaacgccagttggagtaataaatctctaggagatatttgggataacatgacctggatgc 
aatgggatagagaaattaataattacacaaacacaatatacaggttgcttgaagaatcgc 
aaacccagcaggagcaaaatgaaaaagatttattagcattggacaaatggcaaaatctgt 
ggagttggtttaacataacaaattggctgtggtatataaaaatattcataatgatagtag 
gaggtttgataggtttaagaataatttttgctgtgctatctatagtaaatagagttaggc 
agggatactcacctttgtcgtttcagacccttatcccagacccgaggggaccagacaggc 
tcagaagaatcgaagaagaaggtggagagcaagacaaagacagatccgtgcgattagtga 
gcggattcttagcacttgcctgggacgacctgcggagcctgtgccttttcagctaccacc 
tattgagagactttatattgggagtagcgagagtggtggaacttctgggacgcagcagtc 
tcaggaaactacagagggggtgggaagcccttaagtatctgggaagtcttgtgcagtatt 
ggggtctggaactagaaaagagtgctattagtctgcttgataccatagcaataacagtag 
ctggggggacagataggattatagaattcctacaacgaatttgtagagctatacgcaacc 
tacctagaagaataagacatggctttgaagcagctttgcaataactctagaaagaaacaa 
gggcgaattc 

gtcgacaagagcagacgacagtggcaatgagagtgatgggaatactgaggaattgtccac
aatggtggatatggggcatcttaagcttttggatgttaatgatttgtaatgtaggaggga 
aattgtgggtcacagtctattatggggtacctgtgtggaaagaagcaaaaactactctat 
tctgtgcatctgatgctaaagcatatgagagggaggtgcataatgtttgggctacacatg 
cctgtgtacccacagaccccaacccacaagaaatagtattggaaaatgtaacagaaaatt 
ttaacatgtggaaaaatgacatggtggatcagatgcatgaggatataattagtttatggg 
atcaaagcctaaaaccatgtgtaaagttgaccccactctgtgtcactttaaattgtagtg 
atgttatccccagtaatgttacagttacccacaataacatcatggataaagaggaaatga 
gaaattgttcttttaatataaccacagaaataacagataagaaaagcaaagagtatgcaa 
ttttttatagacttgatgtagtaccacttaatgagaaggataacaaatctactgagtata 
gattaataaattgtaatacctcaactgtaacacaagcctgtccaaaggtctcttttgaac 
caattcctatacattattgtgctccagctggttatgcgattctaaaatgtaataataaga 
cattcaatgggacaggaccatgcaataatgtcagtacaatacaatgtacacatggaatca 
agccagtggtatcaactcaactactactaaatggtagcatagcagaagaagggataataa 
ttagatctgaaaatctgacagacaatgctaaaacaataatagtacatcttaatgaatcca 
tacgcattgtgtgtacaagacccaataataatacaagaaaaagtataagaataggaccag 
gacaaacattctttgcaacaaacgacataataggagacataagacaagcatattgtaaca 
ttagtaaagatgactggaataaaaccttacaaagggtagctgagaaattaggaaaacact 
tccctaataaaaacataacgtttagaccatcctcaggaggggacctagagattacaacac 
atagctttaattgtagaggagaatttttctattgcaacacatcaagactgtttaatcata 
catacctgtttaatggtacaggcatgcctaatagtaccacaccttctaatgagaccatca 
tacttccatgcagaataaaacaaattataaacatgtggcaggaggtagggcgagcaatgt 
atgcccctcccactgcaggaaacatcacatgtacatcaaacatcacaggactactattag 
tacgtgatggaggcaacagtggcaacaataccacagaagagatattcagacctggaggag 
gaaatatgagggacaattggagaagtgaattatataaatataaagtggtagaaattaagc 
cattaggaatagctcccactgcggcaaaaaggagagtggtggagagagaaaaaagagcag
tgggaataggagctgtgttccttgggttcttgggagcagcaggaagcactatgggcgcgg 
cgtcaataacgctgacggtacaggccagacaattgttgtctggtatagtgcaacagcaaa 
gcaatttgctgagggccatagaggcgcaacaacatctgttgcaactcacggtctggggca 
ttaagcagctccagacaagagtcctggctatggaaagatacctaaaggatcaacagctcc 
taggaatttggggctgctctggaaaactcatctgcaccactaatgtaccttggaacacca 
gttggagtaataaatctctaagtgatatttgggataacatgacctggatacagtgggata 
gagaaattaataattacacaagcacaatctacaggttgcttgaagaatcgcaaacccagc 
aggaacaaaatgaaaaagatttattagcattggacaaatggcaaaatctgtggagttggt 
ttaacataacaaattggctgtggtatataaaaatattcataatgatagtaggaggcttga 
taggtttaagaataatttttgctgtgctatctatagtaaatagagttaggcagggatact 
cacctttgtcgtttcagacccttatcccagacccgaggggaccagacaggctcagaagaa 
tcgaagaagaaggtggagagcaagacaaagacagatccgtgcgattagtgagcggattct 
tagcacttgcctgggacgacctgcggtgcctgtgccttttcagctaccacctattgagag 
actttatattgggagtagcgagagtggtggaacttctgggacgcagcagtctcaggaaac 
tacagagggggtgggaagcccttaagtatctgggaagtcttgtgcagtattggggtctgg 
aactaaaaaagagtgctattagtctgcttgataccatagcaataacagtagctgggggga 
cagataggattatagaattcctacaacgaatttgtagagctatacgcaacctacctagaa 
gaataagacagggctttgaagcagctttgcaataactctagaaagaaacaagggcgaatt
c 

atgagagtgatggggatactgaggaattgtcaacaatggtggatgtggggcatcttaggc
ttttggatgatttgtaatgtggtggggaatttgtgggtcacagtctattatggggtacct 
gtgtggaaagaagcaaaaactactctattctgtgcatcagatgctaaaggatatgagaaa 
gaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccaacccacaagaa 
ttagttttagaaaatgtaacagaaaattttaacatgtggaaaaatgacatggtggatcag 
atgcatgaggatataatcagtttatgggatcaaagcctaaaagccatgtgtaaagttgac 
cccactttgtgtcactttaagttgtacaaatgctactacctacatagcaccataggggac 
gaaataaaaaattgctctttcaatacaaccacagtactaaaagataagacacagaaagtg 
catgcacttttttataaacttgatgtagtaccacttaatgggagtaactctagtgagtat 
agattaataaattgtaatacctcaaccataacacaagcctgtccaaaggtctcttttgac 
ccaattcctatacattattgtgctccagctggttatgcgattctaaagtgtaataacaag 
acattcaatgggacaggaccatgccaaaatgtcagcacagtacaatgtacacatggaatt 
aaaccagtggtatcaacgcaactactgataaatggtagcctagcagaaggagagataatg 
attagatctgaaaatttgacaaacaatgctaaaacaataatagtgcattttaatcaatct 
atagaaattgtgtgtacaagacccaacaataatacaaggaaaagtgtaaggataggacca 
ggacaaacattctatgcaacaggagacataataggagacataagagaagcacattgtaac 
attagcaaagaaaagtggaataacactttacaagaagtaagtaaaaaattaaaggaacac 
taccctaataaaacaataacatttaaaccacactcaggaggggacccagaaattacaaca 
catagctttatttgtagtggagaatttttctattgtaatacatcaggcctgtttaatggt 
acatacatgcccaatggtacagacaagtctaatgatacatcacccatcacactcccatgc 
agaataaaacaaattataaacatgtggcagggggtaggacgagcaatgtatgccccgccc 
attgcaggaaacataacatgtaaatcaaatatcacaggactactattgacacgtgatgga 
ggagaaaataatagaactaatgagacattcagacctggaggaggagatatgagggacaat 
tggagaagtgaattatataaatataaagtggtagaaattaaaccattgggaatagcaccc 
actaccgcaaaaaggagagtggtggagagagaaaaaagagcagtgggaataggagctatg
ttccttgggttcttgggaatggcaggaagcactatgggcgcggcgtcaataacgctgacg 
gtacaggccagacaattgttgtctggtatagtgcaacagcaaagcaaattgctgagggcc 
atagaggcgcaacagcatatgttgcaactcacggtctggggcattaagcagctccaggca 
agagtcctggctataaaaagatacctaaaggatcaacagctcctaggactgtggggctgc 
tctggaaaactcatctgcaccactgctgtgccttggaactccagttggagtaataataag 
tctcaaacagaaatttgggataacatgacctggatgcagtgggatagagaaattagtaat 
tactcaaacacaatatacaggttgcttgaagaatcgcaaaaccagcaggaaaagaatgaa 
aaggatttattagcattggacagttggaataatctgtggaattggtttagtataacaaag 
tggttgtggtatataagaatattcataataatagtaggaggcttgataggtttaagaata 
atttttgcagtgatctctatagcgaatagagttaggcagggatactcacctctgtcgttg 
cagacccttatcccagacccgaggggacccgacaggcccggaagaatcgaagaagaaggt 
ggagagcaagacagagacagatccataagattagtgagcggattcttagcacttgcctgg 
gacgatctgaggagcctgtgccttttctgctaccaccgattgagagacttcatattgatt 
gcagcgagagtggtggaacttctgggacgcagcagtctcaggggactacagagggggtgg 
gaagcccttaagtatctgggaagtcttgtgcagtattggggtctagagctaaaaaagagt 
gctattagtctgcttgataccatagcaatagcaacagctgaaggaacagataggattata 
gaattaatacaaggaattggtagagctatctacaatatacccagaagaataagacagggc 
tttgaagcagctttgcaataa 
FIGURE 71 (SEQ ID NO:74)



gtcgacaagagcagaagacagtggcaatgagagtgatggggagcaggaggaattatcaac
aatggtggatatggggaatcttaggcttttggatgctaatggttggtaatgtaatgggga 
acttgtgggtcacagtctattatggggtacctgtgtggaaagaagcaaaagctacgctat 
tttgtgcatctgatgcaaaagcatatgagaaagaagtgcataatgtctgggctacacatg 
cctgtgtacccacagaccccgacccacaagaaatagttttggagaatgtaacagaaaatt 
ttaacatgtggaaaaataacatggtggaccagatgcatgaggatataatcagcttatggg 
atcaaagcctaaagccatgtgtaaagttgaccccactttgtgtcactttaaactgtagca 
ataatgttaaaaatgctaccaacagtatgaaggaaatgaaaaattgcactttcaatataa 
ccacagaactaagagataagagaaagcaagaatatgcacttttttataaacttgatatag 
taccacttgaggagaattccagtaagtatagattaataaattgtaatacctcagccataa 
cccaagcctgtccaaaggtctcttttgacccaattcctatacattattgtgctccagctg 
gttatgcgattctaaagtgtaataataagacattcaatggaacaggaccatgcaataatg 
tcagcactgtacagtgtacacatggaatcaagccagtagtatcaactcaactactgttaa 
atggtagtctagcagaagaagaaatagtaattagatctgaaaatatgacaaacaatgcca 
aaataataatagtacatcttaatgaatctgtagaaattacgtgtacaaggcccaacaata 
atacaaggaaaagtatgaggataggaccaggacaaacattctatgcaacaggagacataa 
taggagatataagacaagcacactgtaacattagtgaaaagcaatgggatcagactttat 
acagggtaagtgaaaaattaaaagaacacttccctaataaaacaataaagtttaactcat 
cctcaggaggggacttagaaattacaacacatagctttaattgtggaggagagtttttct
attgcaatacatcagcactgtttaatggcatatacagtaatggcacaaacagtacaaata
caacagtcatcacactccaatacagaataagacaaattataaacatgtggcagggggtag 
gacgagcaatgtatgcccctcccattgcaggaaacataacatgcagatcaaacatcacag 
gactaatattgacacgtgatggaggtgaagggaatggcacgaatacggatgagatattta 
gacctgcaggaggagatatgagggacaattggagaagtgaattatacaaatataaagtgg 
tagaaattcagccattaggggtagcacccactaaggcaaaaaggagagtggtggagagag 
aaaaaagagcagctttgggagctgtgttccttgggttcttgggagcagcaggaagcacta 
tgggcgcggcatcaataacgctgacggtacaggccagacaactgttgtctggtatagtgc 
aacagcaaagcaatttgctgagagctgtagaggcgcaacagcatatgttgcaactcacgg
tctggggcattaagcagctccagacaagagtcctggctatagaaagatacctaaaggatc 
aacagctcctagggatttggggctgctctggaaaactcatctgcaccactgccgtgcctt 
ggaacaatagttggagtaataaatctcaagattatatttggggaaacatgacctggatgc 
aatgggataaagaaattaacaattacacagacacaatatacaggttgcttggggacgcgc 
aaaaccagcaggaggaaaatgaaaaggagttactagaattggacaggtggggaaatctgt 
ggaattggtttgacatgacaagctggctgtggtatataaaaatattcataatggtaatag 
gaggcttgataggtttaagaataatttttgccgtgctttctatagtaaatagagttaggc 
agggatactcacctttgtcatttcagacccttgcccaaaacccgaggggacccgacaggc 
tcggaagaaccgaagaagaaggtggagagcaagacagagacagatccataagattagtga 
gcggattcttagcacttgcctgggaggacttgaggaacctgtgcatcttcctctaccacc 
gattgagggacttcgtattggtgacagcgagagcagtggaacttctgggacgcagcagtc 
tcaggggacttcagagggggtgggaaatccttaagtatttggggagtcttgtgcagtatt 
ggggtctagagctaaaaaagagtgctgttagtctgcttgatagcttagcaatagcagtag 
ctgagggaacagatagaattatagaattcttacaaggaattggtagagctatctacaata 
tacctagaagaataagacagggctttgaagcagctttgcaataactctagaaagaaacaa 
gggcgaattcc 

gtcgacaagagcagaagacagtggcaatgagagtgatggggagcaggaggaattatcaac
aatggtggatatggggaatcttaggcttttggatgctaatggttggtaatgtaatgggga 
acttgtgggtcacagtctattatggggtacctgtgtggaaagaagcaaaagctacgctat 
tttgtgcatctgatgcaaaagcatatgagaaagaagtgcataatgtctgggctacacatg 
cctgtgtacccacagaccccgacccacaagaaatagttttggagaatgtaacagaaaatt 
ttaacatgtggaaaaataacatggtggaccagatgcatgaggatataatcagcttatggg 
atcaaagcctaaagccatgtgtaaagttgaccccactttgtgtcactttaaactgtagca 
ataatgttaaaaatgctaccaacagtatgaaggaaatgaaaaattgcactttcaatataa 
ccacagaactaagagataagagaaagcaagaatatgcacttttttataaacttgatatag 
taccacttgaggagaattccagtaagtatagattaataaattgtaatacctcagccataa 
cccaagcctgtccaaaggtctcttttgacccaattcctatacattattgtgctccagctg 
gttatgcgattctaaagtgtaataataagacattcaatggaacaggaccatgcaataatg 
tcagcactgtacagtgtacacatggaatcaagccagtagtatcaactcaactactgttaa 
atggtagtctagcagaagaagaaatagtaattagatctgaaaatatgacaaacaatgcca 
aaataataatagtacatcttaatgaatctgtagaaattacgtgtacaaggcccaacaata 
atacaaggaaaagtatgaggataggaccaggacaaacattctatgcaacaggagacataa 
taggagatataagacaagcacactgtaacattagtgaaaagcaatgggatcagactttat 
acagggtaagtgaaaaattaaaagaacacttccctaataaaacaataaagtttaactcat 
cctcaggaggggacttagaaattacaacacatagctttaattgtggaggagagtttttct 
attgcaatacatcagcactgtttaatggcatatacagtaatggcacaaacagtacaaata 
caacagtcatcacactccaatacagaataagacaaattataaacatgtggcagggggtag 
gacgagcaatgtatgcccctcccattgcaggaaacataacatgcagatcaaacatcacag 
gactaatattgacacgtgatggaggtgaagggaatggcacgaatacggatgagatattta 
gacctgcaggaggagatatgagggacaattggagaagtgaattatacaaatataaagtgg 
tagaaattcagccattaggggtagcacccactaaggcaaaaaggagagtggtggagagag 
aaaaaagagcagctttgggagctgtgttccttgggttcttgggagcagcaggaagcacta 
tgggcgcggcatcaataacgctgacggtacaggccagacaactgttgtctggtatagtgc 
aacagcaaagcaatttgctgagagctgtagaggcgcaacagcatatgttgcaactcacgg 
tctggggcattaagcagctccagacaagagtcctggctatagaaagatacctaaaggatc 
aacagctcctagggatttggggctgctctggaaaactcatctgcaccactgccgtgcctt 
ggaacaatagttggagtaataaatctcaagattatatttggggaaacatgacctggatgc 
aatgggataaagaaattaacaattacacagacacaatatacaggttgcttggggacgcgc 
aaaaccagcaggaggaaaatgaaaaggagttactagaattggacaggtggggaaatctgt 
ggaattggtttgacatgacaagctggctgtggtatataaaaatattcataatggtaatag 
gaggcttgataggtttaagaataatttttgccgtgctttctatagtaaatagagttaggc 
agggatactcacctttgtcatttcagacccttgcccaaaacccgaggggacccgacaggc 
tcggaagaaccgaagaagaaggtggagagcaagacagagacagatccataagattagtga 
gcggattcttagcacttgcctgggaggacttgaggaacctgtgcatcttcctctaccacc 
gattgagggacttcgtattggtgacagcgagagcagtggaacttctgggacgcagcagtc 
tcaggggacttcagagggggtgggaaatccttaagtatttggggagtcttgtgcagtatt 
ggggtctagagctaaaaaagagtgctgttagtctgcttgatagcttagcaatagcagtag 
ctgagggaacagatagaattatagaattcttacaaggaattggtagagctatctacaata 
tacctagaagaataagacagggctttgaagcagctttgcaataactctagaaagaaacaa 
gggcgaattcc 
FIGURE 73 (SEQ ID NO:76)

atgaaagtgagggagatacagaggaattggccacaatggtggatatggggcatcttaggc 
ttttggatgataataatttgtagtggggtggggaacttgtgggtcacagtctattatggg 
gtacctgtgtggaaagaagcaacaactactctattctgtgcatcagatgctaaagcatat 
gagaaagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccgaccca 
caagaaatagttttggaaaatgtaacagaacattttaacatgtggaaaaatgacatggtg 
gatcagatgcatgaggatataatcagtttatgggatcaaagtctaaaaccatgtgtaaag 
ttgaccccactctgtgtcactttaaattgtacaaatgctatcaatacaaatgctaccagt 
acaactactaccagtgcaactgctaccagtacaattgctaccagtacctatgataataat 
ggagaaataaaaaattgctctttcaatacgaccacagaaataagagataagaaacagaac 
acatatgcacttttttatagatctgatatagtaccacttaataataggagtgagtatata 
ttaataaattgtaatacctcaaccataacacaagcctgtccaaaggtctcttttgaccca 
attcctatacattattgtgctcccgctggtttcgcgattctaaagtgtaataataagaca 
ttcaatgggacaggaccatgccaaaatgtcagcacagtacaatgtacacatggaattaaa 
ccagtggtatcaactcaactactgttgaatggtagcctggcagaagaggatataagaatt 
agatctgaaaatctggaaaacaatatcaaaacaatagtagtccaccttaatcaatctgta 
aaaattgtgtgtacaagacccaacaataatacaagaagaagtataaggataggaccagga 
caagcattctatacaaatgacataataggagacataagacaagcacattgtaacattagt 
agagctgagtggaacaacactctagctaaggtaaaggaaaaattagaaaaactctacaat 
aaaacaatagtatttgaaccacactcaggaggggatctagaaattacaacacatagcttt 
aattgtagaggagaattcttctattgcaatacaacaaaactgtttaatataacagaagtg 
cagaggaatgtaaatgatacaaatggcacactcacactcccatgcaggataaaacaattt 
ataaacatgtggcaggaggtaggacgggcaatgtatgcccctcccattgcaggaaacata 
acatgtagatcaaatatcacaggactactattgacacgtgatggaggaaacataacgaac 
gagacagagacatctagacctggaggaggaaatatgaaagacaattggagaagtgaatta 
tataaatataaagtggtagaaattaagccattgggaatagcacccactgaggcaaaaagg 
agagtggtggagagagaaaaaagagcagtgggaataggagctgtgttccttgggttcttg 
ggagcagcaggaagcactatgggcgcggcgtcaataacgctgacggtacaggccagacaa 
ctgttgtctggtatagtgcaacagcaaagcaatttgctgagagctatagaggcgcaacag 
catctgttgcaactcacagtctggggcattaagcagctccaggcaagagtcttggctata 
gaaagatacctaaaggatcaacagctcctagggctttggggctgctctggaaaactcatc 
tgcaccactgctgtgccttggaactccagttggagtaataaatctcaaacagatatttgg 
gacaacatgacctggatgcagtgggatagaaaaattagtaattacacaggcataatatac 
aggttgcttgaggactcgcaaacccagcaggaacaaaatgaaaaagatttattagcattg 
gacagttggaaaaatctgtggacttggtttgacatatcaaagtggttgtggtatataaga 
atattcatcatgatggtaggaggcttgataggtttaagaataattttaggtgtgctctct 
atagtgaaaagagttaggcagggatactcacctttgtcgtttcagacccttatcccaaac 
ccgagggaacccgacaggctcagagggatcgaagaagaaggtggagagcaagacaaagac 
agatcaattcgattagtgagcggattcttagcacttgcctgggacgacctgcggagcctg 
tgcctcttcagctaccaccaattgagagacttcatattgattgtggcgagagcagtggaa
cttctgggacagagcagtctcaggggactacagagggggtgggaagcccttaagtatctg 
ggaaatcttgtgcagtattggggtctggaactaaaaaagagtgctattagtctgcttgat 
accatagcaatagcagtagctgaaggaacagataggattattgaaataatacagagaatt 
tgtagagctatccgcaacatacctagaagaataagacagggctttgaagcagctttgcta 
taa 
FIGURE 74 (SEQ ID NO:77)



atgaaagtgagggagatacagaggaattggccacaatggtggatatggggcatcttaggc 
ctttggatgataataatttgtagtggggtggggaacttgtgggtcacagtctattatggg 
gtacctgtgtggaaagaagcaacaactactctattctgtgcatcagatgctaaagcatat 
gagaaagaagtgcataatgtctgggctacacatgcctgtgtacccacagaccccgaccca 
caagaaatagttttggaaaatgtaacagaacattttaacatgtggaaaaatgacatggtg 
gatcagatgcatgaggatataatcagtttatgggatcaaagtctaaaaccatgtgtaaag 
ttgaccccactctgcgtcactttaaattgtacaaatgctatcaatacaaatgctaccagt 
acaactactaccagtgcaactgctaccagtacaattgctaccagtacctatgataataat 
ggagaaataaaaaattgctctttcaatacgaccacagaaataagagataagaaacagaac 
acatatgcacttttttatagatctgatatagtaccacttaataataggagtgagtatata 
ttaataaattgtaatacctcaaccataacacaagcctgtccaaaggtctcttttgaccca 
attcctatacattattgtgctcccgctggtttcgcgattctaaagtgtaataataagaca 
ttcaatgggacaggaccatgccaaaatgtcagcacagtacaatgtacacatggaattaaa 
ccagtggtatcaactcaactactgttgaatggtagcctagcagaagaggatataagaatt 
agatctgaaagtctggaaaacaatatcaaaacaataatagtccaccttgatcaatctgta
aaaattgtgtgtacaagacccaacaataatacaagaagaagtataaggataggaccagga 
caagcattctatacaaatgacataataggagacataagacaagcacattgtaacattagt 
agagctgagtggaacaacactctagctaaggtaaaggaaaaattagaaaaactctacaat 
aaaacaatagtacttgaaccacactcaggaggggatctagaaattacaacacatagcttt 
aattgtagaggagaattcttctattgcaatacaacaaaactgtttaatataacagaagtg 
cagaggaatgtaaatgatacaaatggcacactcacactcccatgcaggataaaacaattt 
ataaacatgtggcaggaggtaggacgggcaatgtatgcccctcccattgcaggaaacata 
acatgtagatcaaatatcacaggactactattgacacgtgatggaggaaacataacgaac 
gagacagagacatttagacctggaggaggaaatatgaaagacaattggagaagtgaatta 
tataaatataaagtggtagaaattaggccattgggaatagcacccactgaggcaaaaagg 
agagtggtggagagagaaaaaagagcagtgggaataggagctgtgttccttgggttcttg 
ggagcagcaggaagcactatgggcgcggcgtcaataacgctgacggtacaggccagacaa 
ctgttgtctggtatagtgcaacagcaaagcaatttgctgagagctatagaggcgcaacag 
catctgttgcaactcacagtctggggcattaagcagctccaggcaagagtcttggctata 
gaaagatacctaaaggatcaacagctcctagggctttggggctgctctggaaaactcatc 
tgcaccactgctgtgccttggaactccagttggagtaataaatctcaaacagatatttgg 
gataacatgacctggatgcagtgggatagagaaatcagtaattacacaggcataatatac 
aggttgcttgaagactcgcaaacccagcaggaacaaaatgaaaaagatttattagcattg 
gacagttggaaagatctgtggacttggtttgacatatcaaagtggttgtggtatataaga 
atattcatcatgatagtaggaggcttgataggtttaagaataattttaggtgtgctctct 
atagtgaaaagagttaggcagggatactcacctttgtcgtttcagacccttatcccaaac 
ccgagggaacccgacaggctcagaggaatcgaagaagaaggtggagagcaagacaaagac 
agatcaattcgattagtgagcggattcttagcacttgcctgggacgacctgcggagcctg 
cgcctcttcagctaccaccaattgagagacttcatattgattgtggcgagagcagtggaa 
cttctgggacagagcagtctcaggggactacagagggggtgggaagcccttaagtatctg 
ggaaatcttgtgcagtattggggtctggaactaaaaaagagtgctattagtctgcttgat 
accatagcaatagcagtagctgaaggaacagataggattgttgaaataatacagagaatt 
tgtagagctatccgcaacatacctagaagaataagacagggctttgaagcagctttgcta 
taa 
FIGURE 75 (SEQ ID NO:78)



gtcgacaagagcagaagacagtggcaaggagtgagggggatacagaggaattggcaacaa 
tggtggatatggggcatcttaggcttttggatgttaatgatttgtaatgtgttgggaaac 
ttgtgggtcacagtgtattatggggtacctgtgtggaaagaagcaataactactctattc 
tgtgcatcaaatgctaaagcatatgagagggaggtgcataatgtctgggctacacatgcc 
tgtgtacccacagaccccaacccacaagaaatagttttgggaaatgtaacagaaaatttt 
aatatgtggaaaaatgacatggtggatcaaatgcatgaggatataatcagtttatgggat 
caaagcctaaagccatgtgtaaagttgaccccactctgtgtcactttagaatgtacaggg 
gttaaggctaccaataatagtagtgccaccaatagtagtaatgttaccaacaatgatgaa 
ataaaaaattgctctttcaatgcaaccacagaaataaaagacaagaagcacaaagagtat 
gcacttttttataggctcgatatagtaccacttaataatggcaaccctagtgagggcaat 
tctagtgagaagtatagattaataaattgtaatacctcaaccttaacacaagcctgtcca 
aaggtctcttttgacccaattcctatacattattgcactccagctggttatgcgattcta 
aagtgtaataataagacattcaatgggacaggaccatgccataatgtcagtacagtacaa 
tgtacacatggaattaaaccagtggtatcaactcaactactgttaaatggtagcttagca 
gaagaagagataataattagatctgaaaatctgacaaacaatgctaaaataataatagta 
cagcttaataaatctgtagaaattgtgtgcacaagacccggcaataatacaagaaaaagt 
gtaaggataggaccaggacaaacattctatgcaacaggtgacataataggagacataaga 
caagcacattgtaacattactgaagataagtggaatgaaactttacaatgggtaggtaaa 
aaattaggagagctcttccctaataaaacaatagaatttaagccatcctcaggaggggac 
ctagaaattacaacacatagctttaattgtagaggagaatttttctattgcaatacatca 
caactatttaatagtacatacaattctacacaaatgcataatgatacaggaagtaattca 
accatcacactcccatgcaaaataaagcaaattataaacatgtggcagggggtaggacgg 
gcaatgtatgcccctcccattgcaggaaacataacatgtaaatcaaatattacaggaata 
ctattagtacgtgatggaggcaacacaaatgacacaaatggcacaggaatattcagacct 
ggaggaggagatatgaaggacaattggagaagtgaattatataaatataaagtggtagaa 
attaagccattgggaatagcacccactgaagcaaaaaggagagtggtggagagagaaaaa 
ggagcagtaggaataggagctgtactccttgggttcttgggagcagcaggaagcactatg 
ggcgcagcgtcaataacgctgacggtacaggccaggcaattgttgtctggcatagtgcaa 
cagcaaagcaatttgctgagagctatagaggcgcaacagcatatgttgcaactcacggtc 
tggggcattaagcagctccaggcaagagtcctggctatagaaagatacctacaggatcaa 
cagctcctaggactttggggctgctctggaaaactcatctgcaccactactgtgccttgg
aactcaagttggagtaataaatctctaactgatatttgggataacatgacatggatgcag 
tgggatagagaaattaataattacacaaccacaatataccagttgcttgaaaaatcgcaa 
atccagcaggaacaaaatgagaaagatttattagcattggacaagtggcaaaatctgtgg 
aattggtttagcataacacagtggctatggtatataaaaatattcatcatgatagtagga 
ggcttgataggtttaagaataatttttgctgtgctatctatagtaaacagagttaggcag 
ggatactcacctctgtcatttcagacccttaccccaaacccgaggggacccgacaggctc 
ggaagaatcgaagaagaaggtggagagcaagacagagagagatccattcgattagtgagc 
ggattcttctcacttgcttgggacgatctgcggaacctgtgcctcttcagctaccaccga 
ttgagagacttcatattgattgcgacaagagtggtggaacttctggggcgcagggggtgg 
gaaacccttaaatatctaggaagtcttgggcagtattggggtctggaactaaaaaagagt 
gctattagtctgcttgatgccatagcaatagcagtagctgagggaacagataggattata 
gaattcatacaaagaatttgtagggctatccgcaacacacctagaagaataagacatggc 
ttttaagcagctttgcaataactctagaaagaaacaagggcgaattcc 

gtcgacaagagcagaagacagtggcaatgagagtgagggggatacagaggaattggcaac
aatggtggatatggggcatcctaggcttttggatgttaatgatttgtaatgtgttgggaa 
acttgtgggtcacagtgtattatggggtacctgtgtggaaagaagcaaaaactactctat 
tctgtgcatcagatgctaaagcatatgagagggaggtgcataatgtctgggctacgcatg 
cctgtgtacccacagaccccaacccacaagaaatagttttgggaaatgtaacagaaaatt 
ttaatatgtggaaaaatgacatggtggatcaaatgcatgaggatataatcagtttatggg 
atcaaagcctaaagccatgtgtaaagttgaccccactctgtgtcactttagaatgtacag 
gggttaaggctaccaataatagtagtgccaccaatagtagtaatgttaccaacaaagatg 
aaataaaaaattgctctttcaatgcaaccacagaaataaaagacaagaagcacaaagagt 
atgcacttttttataggctcgatatagtaccacttaataatggcaaccctagtgagggca 
attctagtgagaagtatagattaacaaattgtaatacctcaaccttaacacaagcctgtc 
caaaggtctcttttgacccaattcctatacattattgcactccagctggttatgcgattc 
taaagtgtaataataagacattcaatgggacaggaccatgccataatgtcagtacagtac 
aatgtacacatggaattaaaccagtggtatcaactcaactactgttaaatggtagcttag 
cagaagaagagataataattagatctgaaaatctgacaaacaatgctaaaataataatag 
tacagcttaataaatctgtagaaattgtgtgcacaagacccggcaataatacaagaaaaa 
gtgtaaggataggaccaggacaaacattctatgcaacaggtgacataataggagacataa 
gacaagcacattgtaacattactgaagataaatggaatgaaactttacaatgggtaggta 
aaaaattaggagagctcttccctaataaaacaatagaatttaagccatcctcaggagggg 
acctagaaattacaacacatagctttaattgtagaggagagtttttctattgcaatacat 
cacaactatttaatagtacatacaattctacacaaatgcataatgatacaggaagtaatt 
caaccatcacactcccatgcaaaataaagcaaattataaacatgtggcagggggtaggac 
gggcaatgtatgcccctcccattgcaggaaacataacatgtaaatcaaatattacaggaa 
tactattagtacgtgatggaggcaacacaaatgacacaaatggcacagaaatattcagac 
ctggaggaggagatatgaaggacaattggagaagtgaattatataaatataaagtggtag 
aaattaagccattgggaatagcacccactgaagcaaaaaggagagtggtggagagagaaa 
aaagagcagtaggaataggagctgtactccttgggttcttgggagcagcaggaagcacta 
tgggcgcagcgtcaataacgctgacggtacaagccaggcaattgttgtctggcatagtgc 
aacagcaaagcaatttgctgagagctatagaggcgcaacagtatatgttgcaactcacgg 
tctggggcattaagcagctccaggcaagagtcctggctatagaaagatacctacaggatc 
aacagctcctaggactttggggctgctctggaaaactcatctgcaccactactgtgcctt 
ggaactcaagttggagtaataaatctctaactgatatttgggataacatgacatggatgc 
agtgggatagagaaattaataattacacaaccacaatataccagttgcttgaaaaatcgc 
aaatccagcaggaacaaaatgagaaagatttattagcattggacaagtggcaaaatctgt 
ggaattggtttagcataacacagtggctatggtatataaaaatattcatcatgatagtag 
gaggcttgataggtttaagaataatttttgctgtgctatctatagtaaacagagttaggc 
agggatactcacctctgtcatttcagacccttaccccaaacccgaggggacccgacaggc 
tcggaagaatcgaagaagaaggtggagagcaagacagagagagatccattcgattagtga 
gcggattcttctcacttgcttgggacgatctgcggaacctgtgcctcttcagctaccacc
gattgagagacttcatattgattgtgacgagagtggtggaacttctggggcgcagggggt 
gggaaacccttaaatatctaggaagtcttgggcagtattggggtctggaactaaaaagga 
gtgctattagtctgcttgatgccatagcaatagcagtagttgagggaacagataggatta 
tagaattcatacaaagaatttgtagggctatccgtaacacacctagaagaataagacagg 
gctttgaagcagctttgcaataactctagaaagaaacaagggcgaattcc 

atgagagtga tggggatcaa gaggaattgt caacaatggt ggatatgggg catcttaggc
ttttgggtgc ttatgatttg taatgtaatg gggaacttgt gggtcacagt ctattatggg 
gtacctgtgt ggagagaagc aaaaactaca ctattctggg catcagatgc taaagcatat 
gagaaagaag tgcataatgt ttgggctaca catgcctgtg tacccacaga ccccaaccca 
caagaaatag ttttggaaaa tgtaacagaa aattttaaca tgtgggaaaa taacatggta 
gaccagatgc atgaggatat aatcagttta tgggatcaaa gtctaaaacc atgtgtaaag 
ttgaccccac tctgtgtcac tttaaattgt agaaatgtaa cggttactac taacaatgat 
aataatgtta cttacaataa tagcatacct gaagaaataa aaaattgctc tttcaatata 
accacagaaa taagagacaa gaaaaagata gaatatgcac ttttttatag acttggtata 
gtaccgctta aggagaacaa acttaattcc agtgagtata gattaataaa ttgtaatacc 
tcagccataa cacaagcctg tccaaaggtc tcttttgacc caattcctat acattattgt 
gctccagctg gttatgcgat actaaagtgt aataataaga cattcaatgg aacaggacca 
tgcaataatg tcagcactgt acagtgtaca catggaatta agccagtggt atcaactcaa 
ctactgttaa atggtagtct agcagaggaa gagataataa ttagatctaa aaatatgaca 
aacaatgtca aaacaataat agtacatctg aatgaatctg tagaaattgt gtgtacaagg 
cccaacaata atacaagaag aagtatgagg ataagaccag gacaaacatt ctatgcaaca 
ggagaaataa taggagacat aagacaagca tattgtaaaa ttagtgaaga tcaatggaat 
aaaactttac gcagggtaag tgaaaaatta agagaacact tccctgataa aacaataaaa 
tttgaaccac cctcaggagg agacttagaa attacaacac atagctttaa ttgtagagga 
gaatttttct attgcaatac atcagaactg tttaatagta catacatgcc taatggtaca 
gaaagtaata caagcaaaac catcatactc ccatgcagaa taaaacaaat tataaatatg 
tggcaggggg taggacgagc aatgtatgcc cctcccattg caggaaacat aacatgtcaa 
tcaaatatca caggaatact attgacccgt gatggaggag aagagtcaaa gtcaaatgga 
acagagatat tcaggcctgc aggaggggat atgaaggaca attggagaag tgaattatat 
agatataaag tggtagaaat taaaccatta ggagtagcac ccactgaggc aaaaaggaga 
gtggtggaga gagaaaaaag agcagtggga ataggagctg tgttccttgg gttcttggga 
gcagcaggaa gcactatggg cgcggcgtca ataacgctga cggtacaggc cagacaaccg 
ttttctggta tagtgcaaca gcaaagcaat ttgctgaggg ctatagaggc gcaacagcat 
atgttgcaac tcacagtctg gggcattaag cagctccaga caagagtcct ggctgtagaa 
agatacctaa aggatcaaca gctcctaggg ctttggggct gctctggaaa actcatctgc 
accactgccg tgccttggaa ctccagttgg agtaataagt ctcaaacaga tatttgggat 
aacatgacat ggatgcagtg ggatagagag atcagtaact acacagaaac aatatacaag 
ttgcttgaag actcgcaaaa ccagcaggaa caaaatgaaa aggatttact agcattggac
agttggaaaa atctgtggaa ttggtttgat ataacaaaat ggctgtggta tataaaaata 
ttcataatga tagtaggagg cttgataggt ttaagaataa tttttgctgt gctatctata 
ataaatagag ttaggcaggg atactcacct ttgtcattac agacccttac cccaaacccg 
aggggaccag acaggctcgg aagaatcgaa gaagaaggtg gagagcaaga cagagacaga 
tccgtgagat tagtgaacgg attcttagca cttgtctggg acgacctgcg gagcctgtgc 
ctcttcagct accaccaatt gagagactta atattgattg tagcgagagc agtggaagtt 
ctgggacgca acagtctcag gggactacag acggggtggg aagctcttaa gtatctggga 
aaccttgtgc tgtattgggg tctggagctg aaaaggagcg ctattagtct gttggataca 
acagcaatag tagtagctga aggaacagat aggatttttg aagcaatatg cagaatttgt 
agagctatcc gtaacatacc tagaagaata agacggggct ttgaagcagc tttgctataa 

ggatccacta gtaacggccg ccagtgtgct
cagaagacag tggcaatgag agtgcagggg 
tggggcatct taggcttttg gataataatg 
acagtttatt atggggtacc tgtgtggaaa 
gatgctaaag catatgagaa agaagtgcat 
acagacccca acccacaaga aatagttttg 
aaaaatgata tggtggatca gatgcatgag 
aagccatgtg taaagttgac cccactttgt 
aatagtactg aaatgtatag gaaaaccaca 
gaaatgaaaa attgctcttt caatgcaacc 
tatgcacttt tttatcgact ggatatagta 
tatagattaa taaattgtaa tacctcaacc 
gacccaattc ctatacatta ttgtactcca
aagacattca gtgggacggg accatgcaat 
attaagccag tggtatcaac tcaactactg 
ataattagat ctaaaaatct gacagacaat 
tctatagcaa ttatgtgtac aagacctggc 
ccaggacaag cattctttgc aacaggagca 
aacattagcg aaggtgaatg gaatagaact 
cacttccctg gtaaaagaat aagatttgca 
acacatagct ttaattgtgg aggagaattt 
aggacataca atacaacaca actgtttaat 
aatttcacac tcccatgcag aataaaacaa 
gcaatgtatg ctcctcctat aaaaggaaac 
ctgttggtgc gtgatggagg agaagacaat 
cctggaggag gagatatgag ggacaattgg 
gaaattaagc cattgggaat agcacctact 
aaaagagcag tgggaatagg agctgtgttc 
atgggcgcgg cgtcaataac gctgacggta 
caacagcaaa gcaatttgct gagggccata 
gtctggggca ttaaacagct ccagacaaga 
caacagctcc taggaatttg gggctgctct 
tggaactcca gttggagtaa tagaactgag 
caatgggata gagaaattag taattactca 
caaaaccagc aggaacaaaa tgaaaaggat 
tggagttggt ttaacatatc aaattggctg 
ggaggcttga taggtttaag aataattttt 
cagggatact cacctttgtc gttgcagacc 
ctcagaggaa tcgaagaaga aggtggagag 
agcggattct tagcacttgc ttgggacgac
caattgagag acttcatatt gattgtagcg 
tgggaagccc ttaaatatct gggaagtctt 
agtgctatta atctgcttga tactatagca 
atagaattaa tactaggact tggtagagct 
ggctttgaag cagctttgca ataactctag 
atcacactgg cggccgc 



ggaattcgcc cttccacgcg tcgacaagag
atactgagga attgtcaaca atggtggaca 
acttgtaatg tggtgggaaa cttgtgggtc 
gaagcaaaaa ctactctatt ctgtgcatca 
aatgtttggg ctacacatgc ctgtgtaccc 
gaaaatgtaa cagaaaattt taatatgtgg 
gatgtaatca gtttatggga ccaaagccta 
gtcactttaa attgtacaga tgttgataaa 
aatgataatg gtaatgatac catagataga 
acagacatac aagataagaa aacgggagtg 
ccactcaatg atactaacaa ctctagggag 
atgacacaag cctgtccaaa ggtctctttt 
gctggttatg cgattctaaa gtgtaataat 
aatgtcagca cagtacaatg tacacatgga 
ttaaatggta gcctagcaga aaaagagata 
gccaaaacaa taatagtaca tcttaatgaa 
aataatacaa gaaaaagtat aaggatagga 
ataataggag atataagaaa agcatattgt 
ttacaaaggg taggtagaaa attagcagaa 
ccaccttcag gaggggacct ggaaattaca 
ttctattgca atacaacaca actgtttaat 
ggtacataca gctctaacga tacagaaagt 
attataaaca tgtggcagga ggtaggacga 
ataacatgta actcaaatat cacaggatta 
aacacagaaa atgacacaga gaccttcaga 
agaagtgaat tatacaaata taaagtggta 
ggggcaaaaa ggagagtggt ggagagagaa 
cttgggttct tgggagcagc aggaagcact 
caggccagac aattattgtc tggtatagtg 
gaggcgcaac aacatatgtt gcaactcaca 
gtattggcca tcgaaagata cctaaaggat 
ggaaaactca tctgcaccac tgctgtgcct 
ggagatattt ggaataacct gacctggatg 
gacacaatat acaggttgct tgaagcatcg 
ttattggcct tgagcaattg gcaaaatctg 
tggtatataa gaatattcat aatgatagta 
gctgtgctct ctttagtgaa taaagttagg 
cttaccccga acccaagggg acccgacagg 
caagacagag acagatccgt tcgattagtg 
ctgcggagcc tgtgcctttt cagctaccac 
agagcggtgg aaattctggg acgcaggggg 
gtgcagtact ggggtctgga acttaaaaag 
atagcagtag ctgaaggaac agataggatt 
atctgcaaca tacctagaag aataagacag 
actagctaag ggcgaattct gcagatatcc 

atgggtgcga gagcgtcaat

ttaaggccag ggggaaagaa 
ctggaaagat ttgcacttaa 
ataaaacagc tacaaccagc 
acagtagcaa ctctctattg 
ttagacaaga tagaggaaga 
gctgacgaaa aggtcagtca 
caccaagcta tatcacctag 
ttcaacccag aggtaatacc 
ttaaacacca tgttaaatac 
accatcaatg aggaggctgc 
gcaccaggcc agatgagaga 
caggaacaaa tagcatggat 
agatggataa ttctggggtt 
gacataaaac aagggccaaa 
ttaagagctg aacaagctac 
caaaatgcga acccagattg 
gaagaaatga tgacagcatg 
gctgaggcaa tgagccaaac 
cctaacagaa ttgttaaatg 
agggccccta ggaaaaaggg 
tgtactgaga ggcaggctaa 
gggaatttcc tccagaacag 
ccagcagaga gcttcaggtt 
gaacctttaa cttccctcaa 



attaagcggc ggaaaattag 

acattatatg ttaaaacatc 
ccctggcctg ttagaaacat 
tcttcagaca ggaacagagg 
tgtacataaa gggataaagg 
acaaaacaaa tgtcagcaaa 
aaattatcct atagtacaga 
aacattgaat gcatgggtaa 
catgtttaca gcattatcag 
agtgggggga catcaagcag 
agaatgggat aggacacatc 
accaagggga agtgacatag 
gacaagtaat ccacctattc 
aaataaaata gtaagaatgt 
agaacccttt agagattatg 
acaagatgta aaaaattgga 
taagaccatt ttaagagcat 
tcagggagtg ggaggaccta 
aaacagtaac atactagtgc 
tttcaactgt ggcaaagtag 
ctgttggaaa tgtggacagg 
ttttttaggg aaaatctggc 
accagagcca acagccccac 
cgaggagaca acccccgtgc 
atcactcttt ggcagcgacc 



ataaatggga aagaattagg 
tagtatgggc aagcagggag 
cagaaggctg taaacaaata 
aacttagatc attattcaac 
tacgagacac caaggaagcc 
aagcacagca ggcaaaagcg 
atgcccaagg gcaaatggta 
aagtaataga ggagaaggct 
aaggagccac cccacaagat 
ccatgcaaat gttaaaagat 
cagtgcatgc agggcctgtt 
caggaactac tagtaccctt 
cagtaggaga catctataaa 
atagccctgt cagcattttg 
tagatcggtt ctttaaaact 
tgacagacac cttgttggtc 
taggaccagg ggcttcatta 
gccataaagc aagggtgttg 
agagaagcaa ttttaaaggc 
ggcacatagc cagaaagtgc 
aagggcacca aatgaaagac 
cttcccacaa ggggaggcca 
cagcagagcc aacagcccca 
cgaggaagga gaaagacagg 
cctcgtcaca ataa 

atgggtgcga gagcgtcaat attaagcggc
ttaaggccag ggggaaagaa acattatatg 
ctggaaagat ttgcacttaa ccctggcctg 
ataaaacagc tacaaccagc tcttcagaca 
acagtagcaa ctctctattg tgtacataaa 
ttagacaaga tagaggaaga acaaaacaaa 
gctgatgaaa aggtcagtca aaattatcct 
caccaagcta tatcacctag aacattgaat 
ttcaacccag aggtgatacc catgtttaca 
ttaaacacaa tgttaaatac agtgggggga 
accatcaatg aggaggctgc agaatgggat 
gcaccaggcc agatgagaga accaagggga 
caggaacaaa tagcatggat gacaagtaat 
agatggataa ttctggggtt aaataaaata 
gacataaaac aagggccaaa agaacccttt 
ttaagagctg aacaagctac acaagatgta 
caaaatgcga acccagattg taagaccatt 
gaagaaatga tgacagcatg tcagggagtg 
gctgaggcaa tgagccaaac aaacagtaac 
tctaacagaa ttgttaaatg tttcaactgt 
agggccccta ggaaaaaggg ctgttggaaa 
tgtactgaga gacaggctaa ttttttaggg 
gggaatttcc tccagaacag accagagcca 
ccagcagaga gcttcaggtt cgaggagaca 
gaacctttaa cttccctcaa atcactcttt 



ggaaaattag ataaatggga aagaattagg 
ttaaaacatt tagtatgggc aagcagagag 
ttagagacag cagaaggctg taaacaaata 
ggaacagagg aacttagatc attattcaac 
ggaatagagg tacgagacac caaggaagcc 
tgtcaacaaa aggcacaaca ggcaaaagcg 
atagtacaga atgcccaagg gcaaatggta 
gcatgggtaa aagtaataga ggagaaggct 
gcattatcag aaggagccac cccacaagat 
catcaagcag ccatgcaaat gttaaaagat 
aggacacatc cagtgcatgc agggcctgtt 
agtgacatag caggaactac tagtaccctt 
ccacctattc cagtagggga catctataaa 
gtaagaatgt atagccctgt tagcattttg 
agagattatg tagatcggtt ctttaaaact 
aaaaattgga tgacagacac cttgttggtc 
ttaagagcat taggaccagg ggcttcatta 
ggaggaccta gccataaagc aagggtgttg 
atactagtgc agagaagcaa ttttaaaggc 
ggcaaggtgg ggcacatagt cagaaattgc 
tgtggacagg aagggcacca aatgaaagac 
aaaatctggc cttcccacaa ggggaggcca 
acagccccac cagcagaacc aacagcccca 
acccccgtgc cgaagaggga gaaagagagg 
ggcaacgacc cctcgtcaca ataa 

atgggtgcga gagcgtcagt attgaaaggg
ttaaggccag ggggaaagaa acactatatg 
ctggaaagat ttgcacttaa ccctggcctt 
atgcaacagc tacaatcagc tcttcagaca 
acagtagcaa ctctctattg tgtacataaa 
ttagacaaga tagaggaaga acaaaataag 
gctgacaaag gaaaggtcag tcaaaattat 
gtacaccagg ccatatcacc gagaacttta 
gctttcagcc cagaggtaat acccatgttt 
gatttaaaca ccatgttaaa tacagtgggg 
gataccatca atgaggaggc tgcagaatgg 
attgcaccag gccaaatgag agaaccaagg 
cttcaagaac aaatagcatg gatgacaagt 
aaaagatgga taattctggg gttaaataaa 
ttggacataa aacaagggcc aaaagaaccc 
actttaaggg ctgaacaatc ttcacaagag 
gtccaaaatg caaacccaga ttgtaagacc 
ttagaagaaa tgatgacagc atgtcaggga 
ttggctgagg caatgagcca agcaaataca 
ggccctaaaa gaactgttaa atgtttcaat 
tgcagggccc ctaggaaaaa gggctgttgg 
gactgtactg aaaggcaggc taatttttta 
tcggggaatt tccttcagag cagaccagag 
ttcgaggagc gggagccgaa agacaaggaa 
ggcagcgacc cctcgtcaca ataa 



aaaaaattag atacatggga aagaattagg 

ctaaaacacc tagtatgggc aagcagggag
ttagaaacag cagaaggctg taaacaaata
ggaacagagg aacttagatc attatataac 
gagatagatg tacgagacac caaggaagcc 
agtcagcaaa aaacacagca agcagaagcg 
ccaatagtgc agaatctcca agggcaaatg 
aatgcatggg taaaagtaat agaagagaag 
acagcattat cagaaggagc taccccacaa 
ggacaccaag cagccatgca aatgttaaaa 
gataggttac atccagtgca tgcagggcct 
ggaagtgaca tagcaggaac tactagtacc 
aacccaccta ttccggtggg agacatctat 
atagtaagaa tgtatagccc tgtcagcatt 
tttagagact atgtagaccg attctttaaa 
gtaaaaaatt ggatgacaga caccttgttg 
attttaagag cattaggacc aggggctaca 
gtgggaggac ctggccacaa agcaagagtt 
aacataatga tgcagaaaag caattttaaa 
tgtggcaagg aagggcatat agccagaaat 
aaatgtggaa aggaaggaca ccaaatgaaa 
gggaaaattt ggccttccta caaggggagg 
ccatcagctc caccagcaga gagcttcagg 
ccacccttaa cttccctcaa atcactcttt 

atgggtgcga gagcgtcaat attaagaggg ggaaaattag ataaatggga aaaaattagg
ttaaggccag ggggaaagaa acgctatatg ataaaacacc tagtatgggc aagcagagag 
ctggaaaaat tcgcacttaa ccctggcctt ttagagacat cagaaggatg taaacagata 
atgaaacagc tacaaccagc tcttcagaca ggaacagagg aacttagatc attattcaac 
accatagcag ttctctattg tgtacatgaa aagatagagg tacaagacac caaggaagcc 
ttagacaaga tagaggaaga acaaaacaaa agtcagcaaa aaacacagca ggcagcagca 
gctgacggaa aagtcagtca aaattatcct atagtgcaga atgcccaagg gcaaatggtg 
caccagagca tatcacctag gactttgaat gcatgggtaa aagtaataga ggagaaggct 
tttagcccag aggtaatacc catgtttaca gcattatcag aaggagccac ctcacaagac 
ttaaacacca tgctaaatac agtgggggga catcaagcag ccatgcaaat gttaaaagat 
accatcaatg aggaggctgc agaatgggat agaatacatc cagtacatgc ggggcctatt 
gcaccaggcc aaatgagaga accaagggga agtgacatag caggaactac tagtaccctt 
caggaacaaa tagcatggat gacaagtaat ccacctatcc cagtgggaga catctataaa 
agatggataa ttttggggtt aaataaaata gtaagaatgt atagccctgt cagcattttg 
gacataaaac aagggccaaa ggaacccttt agagactatg tagacaggtt ctttaaaact 
ttaagagctg aacaagctac acaagatgta aaaaattgga tgacagaaac cttgttggtc 
caaaatgcaa acccagattg taagaccatt ttaagagggt taggaacagg ggctacatta 
gagggaatga tgacagcatg tcagggagtg ggaggacctg gccataaagc aagagtgtta 
gctgaagcaa tgagccaagc aacatataac ataatgatgc agagaagcaa ttttaaaggc 
tctagaaaaa ttgttaaatg tttcaactgt ggcaggaaag ggcacatagc cagaaattgc 
agggccccta gaaaaaaggg ctgttggaaa tgtggaaagg aaggacacca aatgagagaa
tgtactgaaa agcaggctaa ttttttaggg aaaatttggc cttcccacaa ggggaggcca 
gggaatttcc ttcagagcag accagagcca acagccccac cagcagagag cttcaggttc 
gaggagacac cccccgcgat gaagcaggaa ccgaaagaca gggaaccctt aacttccctc 
aaatcactct ttggcagcga cccctcgtca caataa 

atgggtgcga
ttaaggccag 
ctggaaagat 
ataaaacagc 
accgtggtaa 
ttagacaaga 
gctgacggaa 
caccaagcca 
tttagcccag 
ttaaacacca 
accatcaacg 
gcaccaggcc 
caggaacaaa 
agatggataa 
gacataaaac 
ttaagagctg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
ggagccaaaa 
tgcagggccc 
gactgtactg 
ccagggaatt 
ttcgaggaga 
ctcaaatcac 



gagcgtcaat 
ggggaaagaa 
ttgcacttaa 
tacatccagc 
ctctttattg 
tagaggaaga 
aagtcagtca 
tatcacctag 
aggtaatacc 
tgttaaatac 
aggaggctgc 
aaataagaga 
taacatggat 
ttctggggtt 
aagggccaaa 
aacaggctac 
acccagattg 
tgacagcatg 
tgagccaaac 
gaattgttaa 
ctaggaaaaa 
agaggcaggc 
tccttcagaa 
caacacccac 
tctttggcag



attaagaggg 
acattatatg 
ccctggcctt 
tcttcagaca 
cgtacatgca 
acaaaacaaa 
aaattatcct 
aaccttgaat 
catgtttaca 
agtgggggga 
agaatgggat 
accaagggga 
gacaagtaac 
aaataaaata 
ggaacccttt 
acaagaagta 
taagaccatt 
tcagggagtg 
aaacagtgca 
atgcttcaac 
aggctgttgg 
taatttttta 
cagaccagag 
tccgaagcag 
cgacccctcg 



ggaaaattag 
ataaaacacc 
ttagagacag 
ggaacagagg 
gagatagagg 
agtcagcaaa 
atagtacaga 
gcatgggtaa 
gcattatcag 
catcaagcag 
agattacatc 
agtgacatag 
ccacctgttc 
gtaaggatgt 
agagactatg 
aaaggctgga 
ttaagagcat 
ggaggaccta 
agcataatga 
tgtggcaagg 
aaatgtggac 
gggaaaattt 
ccaacagcac 
gagccgaagg 
tcacaataa 



ataaatggga 
tagtatgggc 
cagagggctg 
aacttagatc 
tacgagacac 
aaacacagca 
atctccaagg 
aagtaataga 
aaggagccac 
ccatgcaaat 
cagcacaggc 
caggaactac 
cagtgggaga 
atagccctgt 
tagaccggtt 
tgacagacac 
taggaccagg 
gccacaaggc 
tgcagaaaag 
aggggcacat 
aggaaggaca 
ggccttccca 
caccagcaga 
acagggaacc 



aaaaattagg 
aagcagggag 
taaacaaata 
attatacaac 
caaggaagcc 
ggcaaaagcg 
gcgaatggta 
ggaaaaggct 
cccccaagac 
gttaaaagat 
agggcctgtt 
tagtaccctt 
aatctataaa 
cagcattttg 
ctttaaaact 
cttattggtc 
ggctacacta 
aagagtgttg 
caattttaaa 
agccagaaat 
ccaaatgaaa 
caaaggaagg 
gagcttcagg 
tttaacttcc 

atgggtgcga
ttaaggccag 
ctggaaagat 
ataaaacagc 
accgtggcaa 
ttagacaaga 
gctgacggaa 
caccaggcca 
tttagcccag 
ttaaacacca
accatcaacg 
gcaccaggcc 
caggaacaaa 
agatggataa 
gacataaaac 
ttaagagctg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
ggagccaaaa 
tgcagggccc 
gactgtactg 
ccagggaatt 
ttcgaggaga 
ctcaaatcac 



gagcgtcaat 
ggggaaagaa 
ttgcacttaa 
tacatccagc 
ctctttattg 
tagaggaaga 
aagtcagtca 
tatcacctag 
aggtaatacc 
tgttaaatac 
aggaggctgc 
aaataagaga 
taacatggat 
ttctggggtt 
aagggccaaa 
aacaggctac 
acccagattg 
tgacagcatg 
tgagccaaac 
gaattgttaa 
ctaggaaaaa 
agagacaggc 
tccttcagaa 
caacacccac
tctttggcag 



attaagaggg 
acattatatg 
ccctggcctt 
tcttcagaca 
cgtacatgca 
acaaaacaaa 
aaattatcct 
aaccttgaat 
catgtttaca 
agtgggggga 
agaatgggat 
accaagggga 
gacaagtaac 
aaataaaata 
ggaacccttt 
acaagaagta 
taagaccatt 
tcagggagtg 
aaacagtgca 
atgcttcaac 
aggctgttgg 
taatttttta 
cagaccagag 
tccgaagcag 
cgacccctcg 



ggaaaattag 
ataaaacacc 
ttagagacag 
ggaacagagg 
gagatagagg 
agtcagcaaa 
atagtacaga 
gcatgggtaa 
gcattatcag 
catcaagcag 
agattacatc 
agtgacatag 
ccacctgttc 
gtaaggatgt 
agagactatg 
aaaggctgga 
ttaagagcat 
ggaggaccta 
agcataatga 
tgtggcaagg 
aaatgtggac 
gggaaaattt 
tcaacagcac 
gagccgaagg 
tcacaataa 



ataaatggga 
tagtatgggc 
cagagggctg 
aacttagatc 
tacgagacac 
aaacacagca 
atctccaagg 
aagtaataga 
aaggagccac 
ccatgcaaat 
cagcacaggc 
caggaactac 
cagtgggaga 
atagccctgt 
tagaccggtt 
tgacagacac 
taggaccagg 
gccacaaggc 
tgcagaaaag 
aggggcacat 
aggaaggaca 
ggccttccca 
caccagcaga 
acagggaacc 



aaaaattagg 
aagcagggag 
taaacaaata 
attatataac 
caaggaagcc 
ggcaaaagcg 
gcaaatggta 
ggaaaaggct 
cccccaagac 
gttaaaagat 
agggcctgtt 
tagtaccctt 
aatctataaa 
cagcattttg 
ctttaaaact 
cttattggtc 
ggctacacta 
aagagtgttg 
caattttaaa 
agccagaaat 
ccaaatgaaa 
caaaggaagg 
gagcttcagg 
tttagcttcc 

atgggtgcga
aaggccaggg 
ggaaagattt 
gaaccagcta 
agtagcaact 
agacaagata 
tggcgaaaag 
ccaagctata 
cagcccagag 
aaacaccatg 
catcaatgag 
accaggacag 
ggaacaaata 
atggataatt 
cataaaacaa 
aagagccgaa 
aaatgcgaac 
agaaatgatg 
tgaggcaatg 
ctctaagaga 
cagggcccct 
ctgtactgag 
agggaatttc
cgaggagaca 
ctttggcagc 



gcgtcaatat 
ggaaagaaac 
gcacttaacc 
caaccatctc 
ctctattgtg 
gaggaagaac 
gtcagtcaaa 
tcacctagaa 
gtaataccca 
ttaaatacag 
gaagctgcag 
atgagagaac 
gcatggatga 
ctggggttaa 
gggccaaagg 
caggctacac 
ccagattgta 
acagcatgtc 
agccaagcaa 
attgttaaat 
aggaaaaagg 
aggcaggcta 
cttcagaaca 
acccctgcgc 
gacccctcgt 



taaaaggggg 
actatatgat 
ctggcctgtt 
ttcagacagg 
tacatgaaaa 
aaaacaaaag 
attatcctat 
cgttaaatgc 
tgtttacagc 
tgggaggaca 
aatgggatag 
caaggggaag 
caagtaatcc 
ataaaatagt 
aaccctttag 
aagatgtaaa
agaccatttt 
agggagtggg 
acaatataaa 
gcttcaactg 
gctgttggaa 
attttttagg 
ggccagagcc 
cgaagcagga 
cacaataa 



aaaattagat 
aaaacattta 
agagacatca 
aacagaagaa 
gatagaggta 
ccagcaaaaa 
agtgcagaat 
atgggtaaaa 
attatcagaa 
tcaagcagcc 
ggtacatcca 
tgacatagca 
acctattcca 
aagaatgtat 
ggactatgta 
aaattggatg 
aagagcatta 
aggacctagc 
catactgatg 
tggcaaggaa 
atgtggaaag 
gaaaatttgg 
aacagcccca 
caaggaaccc 



gcatgggaaa 
gtatgggcaa 
gaaggatgta 
cttagatcat 
cgagacacca 
acacaacagg 
gcccaagggc 
gtaatagagg 
ggagccaccc 
atgcaaatgt 
gtgcatgcag 
ggaactacta 
gtaggagaaa 
agccctgtca 
gaccggttct 
acagacacct 
ggaccagggg 
cacaaagcaa 
cagagaagca 
gggcacatag 
gaaggacacc 
ccttcccgca 
ccagcagaaa 
ttaacttccc 



gaattaggtt 
gcagggagct 
aacaaataat 
tatacaacac 
aggaagcctt 
caaaagcggc 
aaatggtaca 
agaaggcttt 
cacaagattt 
taaaagatac 
ggcctgttgc 
gtaccctgca 
tttataaaag 
gcatcttgga 
ttaaaacttt 
tgttggtcca 
cttcattaga 
gagtgttggc 
attttaaggg 
ccagaaattg 
aaataaaaga 
aggggaggcc 
gcttcaggtt 
tcaaatcact 

atgggtgcga
ttaaggccag 
ctggaaagat 
atgaaccagc 
acagtagcaa 
ttagacaaga 
gctggcgaaa 
caccaagcta 
ttcagcccag 
ttaaacacca 
accatcaatg 
gcaccaggac 
caggaacaaa 
agatggataa 
gacataaaac 
ttaagagctg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
ggctctaaga 
tgcagagccc 
gactgtactg 
ccagggaatt 
ttcgagaaga 
ctctttggca 



gagcgtcaac 
ggggaaagaa 
ttgcacttaa 
tacaaccatc 
ctctctattg 
tagaggaaga 
aggtcagtca 
tatcgcctag 
aggtaatacc 
tgttaaatac 
aggaagctgc 
agatgagaga 
tagcatggat 
ttctggggtt 
aagggccaaa 
aacaagctac 
acccagattg 
taacagcatg 
tgagccaagc 
gaattgttaa 
ctaggaaaaa 
aaaggcaggc 
tccttcagaa 
caacccctgc 
gcgacccctc 



attaaaaggg 
acactatatg 
ccctggcctg 
tcttcagaca 
tgtacatgaa 
acaaaacaaa 
aaattatcct 
aacgttaaat 
catgtttaca 
agtgggagga 
agaatgggat 
accaagggga 
gacaagtaat 
aaataaaata 
ggaacccttt 
acaagatgta 
taagaccatt 
tcagggagtg 
aaacaatata 
atgcttcaac 
gggctgttga 
taatttttta 
caggccagag 
gccgaagcag 
gtcacaataa 



ggaaaattag 
ataaaacatt 
ttagagacat 
ggaacagaag 
aagatagagg 
agccagcaaa 
atagtgcaga 
gcatgggtaa 
gcattatcag 
catcaagcag 
agggtacatc 
agtgacatag 
ccacctattc 
gtaagaatgt 
agggactatg 
aaaaattgga 
ttaagagcat 
ggaggaccta 
aacatactga 
tgtggcaagg 
aaatgtagaa 
gggaaaattt 
ccaacagccc 
gacaaggaac 



atgcatggga 
tagtatgggc 
cagaaggatg 
aacttagatc
tacgagacac 
aaacacaaca 
atgcccaagg 
aagtaataga 
aaggagccac 
ctatgcaaat 
cagtgcatgc 
caggaactac 
cagtaggaga 
atagccctgt 
tagaccggtt 
tgacagacac 
tagggccagg 
gccacaaagc 
tgcagagaag 
aagggcacat 
aagaaagaca 
ggccttccca 
caccagcaga 
ccttaacttc 



aagaattagg 
aagcagggag 
taaacaaata 
attatacaac 
caaggaagcc 
ggcaaaggcg 
gcaaatggta 
ggagaaggct 
cccacaagat 
gttaaaagat 
aaggcctgtt 
tagtaccctg 
aatttataaa 
cagcatcttg 
ctttaaaact 
cttgttggtc 
ggcttcatta 
aagagtgttg 
caattttaag 
agccaaaaat 
ccaaatgaaa 
caaggggagg 
aagcttcagg 
cctcaaatca 

atgggtgcga gagcgtcaat attaagaggg
ttaaggccag ggggaaagaa aacctatagg 
ctggaaagat ttgcacttaa ccctggcctt 
ataagacagc tacacccagc tcttcagaca 
acagtagcaa ctctctattg tgtacatgca 
ttagacaaga tagaggaaga acaaaacaaa 
ggtaacgaaa agatcagtca aaattatcct 
caccaggcct tatcacctag aactttgaat 
ttcagcccag aggtaatacc catgtttaca 
ttaaacacca tgttaaacac agtggggggg 
accatcaatg aagaggctgc agaatgggat 
gcaccaggcc aaatgagaga accaagggga 
caggaacaaa tagcatggat gacaagtaac 
agatggataa ttctggggtt aaataaaata 
gacattaaac aagggccaaa ggaacccttt 
ttaagagctg aacaagctac acaagatgta 
caaaatgcga acccagattg taagatcatt 
gaagaaatga tgacagcatg tcagggagtg 
gctgaggcaa tgagccaagc aaacagtgga 
ggctctaaaa gaattattaa atgttttaac 
tgtaaggccc ctaggaaaag aggctgttgg 
gactgtactg aaagacaggc taatttttta 
ccagggaatt tccttcagaa caggccagag 
ccaccagcag agagcctcag gatcgaggaa 
gacagggaac ccttaatctc cctcaaatca 



ggaaaattag ataaatggga agaaattagg 
ctaaaacatc tagtatgggc aagcagggag 
ttagagacag cagaaggctg taaacaaata 
ggaacggagg aacttagatc attatacaac 
aacatagagg taaaagacac caaggaagcc 
agtcagcaaa aatcagagca ggcaaaagta 
atagtgcaga atctccaagg gcaaatggta 
gcatgggtaa aagtaataga ggagaaggct 
gcattatcag aaggagccac cccacaagat 
catcaagcag ccatgcaaat gttaaaagac 
cgattacacc cagtacatgc agggcctatt 
agtgacatag caggaactac tagcaccctt 
ccacctattc cggtgggaga tatctataaa 
gtaagaatgt atagccctgt cagcattttg 
agagactatg tagaccggtt ctttaaaact 
aaaaattgga tgacagacac cttgttggtc 
ttaagaggat taggaccagg ggctacatta 
ggaggaccta gccacaaagc aagagtgttg 
aacataatga tgcagaaaag caattttaga 
tgtggcaagg aagggcacat agccaaaaat 
aaatgtggaa aggaaggaca ccaaatgaaa 
gggaaaattt ggccttcctg caaggggagg 
ccaacagccc caccagcaga gccaacagcc 
acaacccccg ctccgaagcc ggagccgagg 
ccctttggca gcgacccctc gtcacaataa 

atgggtgcga gagcgtcagt
ttaaggccag ggggaaagaa 
ctggaaagat ttgcacttaa 
atacaacagc tacaaccagc 
acagtagcaa ctctctattg 
ttagacaaga tagaggaaga 
gctgacaaaa aggtcagtca 
caccaagccc tatcacctag 
tttggcccag aggtaatacc 
ttaaacacca tgttaaatac 
accatcaatg aggaggctgc 
gcaccaggcc aaatgagaga 
caggaacaaa tagctcggat 
agatggataa ttctagggtt 
gacataaaac aggggccaaa 
ttaagagctg aacaagctac 
caaaatgcga acccagattg 
gaagaaatga tgacagcatg 
gctgaggcaa tgagccaagc 
tctaaaagaa ttgttaaatg 
agggccccta gaaaaaaggg 
tgtactgaaa ggcaggctaa 
gggaatttcc tccagagcag 
gaggagacaa cccccgctcc 
agatcactct ttggcaacga 



FIGURE 88 (SEQ ID NO:



attaagaggc gaaaaattag 
acgctatatg ctaaaacaca 
ccctggcctt ttagagacat 
tcttcagaca ggaacagagg 
tgtacataaa aagatagagg 
acaaaacaaa agtcagcaaa 
aaattatcct atagtacaga 
aactttgaat gcatgggtaa 
catgtttaca gcattatcag 
agtgggggga catcaggcag 
agaatgggac agattacacc 
acctagggga agtgacatag 
gacaagtaac ccacctgtcc 
aaataaaata gtaagaatgt 
agaacccttt agagactatg 
acaagaggta aaaggttgga 
taagaccatt ttaagagcat 
tcagggagtg ggaggacctg 
aaacagtaac atacttatgc 
tttcaactgt ggcaaggaag 
ctgttggaaa tgtggaaaag 
ttttttaggg aaaatttggc 
accagagcca acagccccac 
gaagcaggag tcgaaagaca 
cccctcgtca caataa 



atacatggga aaaaattagg 
tagtatgggc aagcagggag 
cagaaggctg taaacaaata 
aacttaaatc gttattcaac 
ttcgagacac caaggaagcc 
aaacacagca ggcagaagcg 
acctccaagg gcaaatggta 
aagtaataga ggagaaggct 
aaggagccac cccagcagat 
ccatgcagat gttaaaagat 
cagtacatgc agggcctact 
caggaactac tagtaccctt 
cagtgggaga catctataaa 
atagccctgt cagcattttg 
tagaccggtt ctttaaaact 
tgacagacac cttgttggtc 
taggaccagg ggctacatta 
gccacaaagc cagagtgttg 
agagaagcaa ttttaaaggc 
ggcacatagc cggaaattgc 
aaggacacca aatgaaagaa 
cttcccacaa ggggaggcca 
cagcagagag cttcaggttc 
gggagccctt aacttccctc 

atgggtgcga gagcgtcagt attaagaggc 
ttaaggccag ggggaaagaa acgctatatg 
ctggaaagat ttgcacttaa ccctggcctt 
atacaacagc tacaaccagc tcttcagaca 
acagtagcaa ctctctattg tgtacacaga 
ttagacaaga tagaggaaga acgaaacaaa 
gctgacaaaa aggtcagtca aaattatcct 
caccaggccc tatcacctag aactttgaat 
tttagcccag aggtaatacc catgtttaca 
ttaaacacca tgttaaatac agtgggggga 
accatcaatg aggaggctgc agaatgggac 
gcaccaggcc aaatgagaga acctagggga 
caggaacaaa tagcatggat gacaagtaac 
agatggataa ttctagggtt aaataaaata 
gacataaaac aggggccaaa agaacccttt 
ttaagagctg aacaagctac acaagaggta 
caaaatgcga acccagattg taagaccatt 
gaagaaatga tgacagcatg tcagggagtg 
gctgaggcaa tgagccaagc aaacagtaac 
tctaaaagaa ttgttaaatg tttcaactgt 
agggccccta gaaaaaaggg ctgttggaaa 
tgtactgaaa ggcaggctaa ttttttaggg 
gggaatttcc tccagagcag accagagcca 
gaggagacaa cccccgctcc gaagcaggag 
agatcactct ttggcaacga cccctcgtca 



gaaaaattgg atacatggga aaagattagg 
ctaaaacaca tagtatgggc aagcagggag 
ttagagacat cagaaggctg taaacaaata 
ggaacagagg aacttaaatc attattcaac 
aagatagagg tacgagacac caaagaagcc 
agtcagcaaa aaacacagca ggcagaagcg 
atagtacaga atctccaagg gcaaatggta 
gcatgggtaa aagtaataga ggagaaggct 
gcattatcag aaggagccac cccagcagat 
catcaagcag ccatgcagat gttaaaagat 
agattacacc cagtacatgc agggcctgct 
agtgacatag caggaactac tagtaccctt 
ccacctgtcc cagtgggaga catctataaa 
gtaagaatgt atagccctgt cagcattttg 
agagactatg tagaccggtt ctttaaaact 
aaaggttgga tgacagacac cttgttggtc 
ttaagagcat taggaccagg ggctacatta 
ggaggacctg gccacaaagc cagagtattg 
atatttatgc agagaagcaa ttttaaaggc 
ggcaaggaag ggcacatagc caaaaattgc 
tgtggaaaag aaggacacca aatgaaagac 
aaaatttggc cttcccacaa ggggaggcca 
acagccccac cagcagagaa cttcaggttc 
tcgaaagaca gggagccctt aacttccctc 
caataa 

atgggtgcga gagcgtcaat attaagaggc 
ttaaggccag ggggaaagaa acactatatg 
ctggaaagat ttgcacttaa ccctggcctt 
atacaacagc tacacacagc tcttaagaca 
acagtagcaa ctctctactg tgtacatgca 
ttagacaaga tagaggagga gcaaaacaaa 
gctgacaaaa agaaggtcag tcaaaattat 
gcacaccaga acatatcacc aagaacttta 
ggtttcaacc cagaggtaat acccatgttt 
gatctgaaca ccatgttaaa tatagtgggg 
gataccatca atgaggaggc tgcagaatgg 
gttgcaccag gccaaatcag agatccaagg 
cttcaggaac aagtaacatg gatgacaaat 
aaaagatgga taattctggg attaaataaa 
ttggacatta gacaaggacc aaaggagcct 
actttaagag ctgaacaagc tacacaagat 
gtccaaaatg caaacccaga ttgtaagacc 
ttagaagaaa tgatgacagc atgtcaagga 
ttggctgagg caatgagcca agcaggcaat 
aaaggcccta gaagaactat taaatgcttc 
aattgcaggg cccctaggaa aaaaggctgt 
aaagactgta ctgagaggca ggctaatttt 
aggccaggga acttccttca gaacagacca 
aggttcgagg agacaacccc cgctcagaag 
tccctcaaat cactctttgg cggcgacccc 



ggaaaattag ataaatggga aaaaattaga 
ttaaaacaca tagtatgggc aagcagggag 
ttagagacat cagaaggctg taaacaaata 
ggaacagagg aacttacatc attatacaac 
gggatagagg tacgagacac caaggaggcc 
agtcagaaaa aaatgcagca agcagaagtg 
cctatagtac agaatcacca agggcaaatg 
aatgcatggg taaaagtaat agaggagaag 
acagcattat cagagggagc caccccttct 
ggacatcaag cagccatgca aatgttaaaa 
gatagattac acccagcaca ggcagggcct 
ggaagtgaca tagcaggaac tactagtacc 
aacccaccta ttccagtagg agacatctat 
atagtaagaa tgtatagccc tgtcagcatt 
tttagagact atgtagatcg gttctttaaa 
gtaaaaaatt ggatgacaga caccttgttg 
attttaagag cattaggacc aggggctaca 
gtgggaggac ctagccacaa agcaagagtc 
acaaacataa tgatgcagaa aagcaatttc 
aactgtggca aggaaggaca cctagccaga 
tggaaatgtg gaaaggaagg acaccaaatg 
ttagggaaaa tttggccttc ccactcgggg 
gagccaacag ccccaccagc agagagcttc 
caggagccgc aagacaggga acccttaact 
tcgtcacaat aa 

atgggtgcga gagcgtcaat attaagaggg ggaaaattag ataaatggga aaaaattagg 
ttaaggccag gggggaaaaa acactatatg ctaaaacacc tagtatgggc aagcagagag 
ctggaaagat ttgcagttaa ccctggcctt ttagagacat cagacggatg tagacaaata 
ataaaacagc tacaaccagc tcttcagaca ggaacagagg aaattagatc attatttaac 
acagtagcaa ctctctattg tgtacatgaa gggatagatg tacgagacac caaggaagcc 
ttagacaagt tggaggagga acaaaacaaa tgtcagcaaa aaacacagca ggcagaagcg 
gctgacaaaa aggtcagtca aaattatcct atagtgcaga acctccaagg gcaaatggta 
caccaggcca tatcacctag aaccttgaat gcatgggtaa aagtaataga ggagaaggct 
tttagcccag aggtaatacc catgtttaca gcattatcag aaggagccac cccacaagat 
ttaaacacca tgttaaatac agtgggggga catcaagcag ccatgcaaat gttaaaagat 
accatcaatg aggaggctgc cgaatgggat aggttacatc cagtacatgc agggcctgtt 
gcaccaggcc agatgagaga accaagggga agtgacatag cagaaactac tagtaccctt 
caagaacaaa tagcatggat gacaagtaac ccacctatcc cagtaggaga catctataaa 
aggtggataa ttctggggtt aaataaaata gtaagaatgt acagccctgt cagcattttg 
gacataaaac aaggaccaaa ggaacccttt agagactatg tagaccggtt cttcaaaact 
ttaagagctg aacaatctac acaagaggta aaaaattgga tgacagacac cttgttagtc 
caaaatgcga acccagattg taagaccatt ttaagagcat taggaccagg ggcttcatta 
gaagaaatga tgacagcatg tcagggagtg ggaggaccta gccacaaagc aagagctttg 
gctgaggcaa tgagccaagc aaacaatgca agtgtaatga tgcagaaaag caattttaaa 
ggccctagaa gtactgttaa atgtttcaac tgtggcaagg aagggcacat agccaggaat 
tgcagggccc ctaggaaaaa ggactgttgg aaatgtggaa aggaaggaca ccaaatgaaa 
gactgtactg agagacaggc taatttttta gggaaaattt ggccttccca caaggggagg 
ccagggaatt tccttcagag caggccagag ccaacagccc caccactaga gccaacagcc 
ccaccagcag agagcttcaa gttcgaggag actccgaagc gggagccgaa agacagggaa 
cccttaactt ccctcaaatc actctttggc agcgacccct cgtcacaata a 

atgggtgcga gagcgtcaat 
ttaaggccag gggggaaaaa 
ctggacagat ttgcagttaa 
ataaaacagc tacaaccagc 
acagtagcaa ctctctattg 
ttagacaaga tagaggagga 
gctgacaaaa aggtcagtca 
caccaggcca tatcacctag 
tttagcccag aggtaatacc 
ttaaacacca tgttaaatac 
accatcaatg aggaggctgc 
gcaccaggcc agatgagaga 
caagaacaaa tagcatggat 
aggtggataa ttctggggtt 
gacataaaac aaggaccaaa 
ttaagagctg aacaatctac 
caaaatgcga acccagattg 
gaagaaatga tgacagcatg 
gctgaggcaa tgagccaagc 
ggccctagaa gagctgttaa 
tgcagggccc ctaggaaaaa 
gactgtactg agagacaggc 
ccagggaatt tccttcagag 
ccaccagcag agagcttcaa 
ccctacaggg aacccttaac 
taa 



attaagaggg ggaaaattag 
acgctatatg ctaaaacacc 
ccctggcctt ttagagacat 
tcttcagaca ggaacagagg 
tgtacataaa gggatagatg 
acaaaacaaa tgccagcaaa 
aaattatcct atagtgcaga 
aaccttgaat gcatgggtaa 
catgtttaca gcattatcag 
agtgggggga catcaagcag 
cgaatgggat aggttacatc 
accaagggga agtgacatag 
gacaagtaac ccacctatcc 
aaataaaata gtaagaatgt 
agaacctttt agagactatg 
acaagaggta aaaaattgga 
taagaccatt ttaagagcat 
tcagggagtg ggaggaccta 
aaacaataca agtgtaatga 
atgtttcaac tgtggcaagg 
gggctgttgg aaatgtggaa 
taatttttta gggaaaattt 
cagaccagag ccaacagccc 
gttcgaggag actccgaagc 
ttccctcaaa tcactctttg 



acaaatggga aaaaattagg 
tagtatgggc aagcagagag 
cagacggatg tagacaaata 
aaattagatc attatttaac 
tacgagacac caaggaagcc 
aaacacagca ggcggaagcg 
acctccaagg gcaaatggta 
aagtaataga ggagaaggct 
aaggagccac cccacaagat 
ccatgcaaat gttaaaagat 
cagtacatgc agggcctgtt 
cagaaactac tagtaccctt 
cagtaggaga catctataaa 
acagccctgt cagcattttg 
tagaccggtt cttcaaaact 
tgacagacac cttgttagtc 
taggaccagg ggcttcatta 
cccacaaagc aagagttttg 
tacagaaaag caattttaaa 
aagggcacat agccaggaat 
aggaaggaca ccaaatgaaa 
ggccttccca caagggaagg 
caccactaga accaacagcc 
aggagccgaa agacagggaa 
gcagcgaccc ctcgtcacaa 

atgggtgcga 
ttaaggccag 
ctggaaagat 
atgaaacagc 
acagtagcaa 
ttagacaaga 
aaagcggctg 
atggtacatc 
aaggctttta 
caagatttaa 
aaagatccca 
cctatggcac 
acccttcagg 
tataaaagat 
attttggaca 
aaagccttaa 
ctggtccaaa 
acattggaag 
gtgttagctg 
tttaaaagct 
agaaattgca 
atgaaagatt 
gggaggccag 
ttcaggaaca 
acccccactc 
tttggcagcg 



gagcgtcaat 

ggggaaagaa 
ttgcacttaa 
tacacccagc 
ctctctattg 
tagaggaaga 
acgaaaaagt 
agaacctatc 
gcccagaggt 
acaccatgtt 
tcaatgaaga 
caggccaatt 
aacaaatagc 
ggataattct 
taagacaagg 
gagctgaaca 
atgcgaaccc 
aaatgatgac 
aggcaatgag 
caaaaagaat 
gggcccctag 
gtactgagag 
ggaatttcct 
gaccagagcc 
cgaagcagga 
acccctcgtc 



attaagaggg 
acattatatg 
ccctggcctt 
tcttcagaca 
tgtacatgaa 
acaaaacaaa 
cagtcaaaat 
acctagaacc 
aatacccatg 
aaatacggtg 
ggctgcagaa 
gagagaacca 
atggatgaca 
ggggttaaat 
gccaaaggaa 
agctacacaa 
agattgtaag 
agcatgtcag 
ccaagcaaac 
tgttaaatgt 
gaaaaagggc 
gcaggcaaat 
tcagaacaga 
aacggctcca 
gccgaaagac 
acaataa 



acgaaattag 
ttaaaacacc 
ttagaaacat 
ggaacagagg 
agcataaagg 
attaaaagtc 
tatcctatag 
ttgaatgcat 
tttacagcat 

gggggacatc 

tgggatagat 

aggggaagtg 

agtaatccac 
aaaatagtga 
ccctttagag 
gatgtaaaaa 
accattttaa 
ggagtggggg 
aatacaaaca 
ttcaactgtg 
tgttggaaat 
tttttaggga 
ccagagccaa 
ccagcagaga 
agggatccct 



atgcatggga 
tagtatgggc 
cggaaggctg 
aacttaaatc 
tacgagacac 
agcaaaaaac 
tgcagaatct 
gggtaaaagt 
tatcagaagg 
aagcagccat 
tacacccagt 
acatagcagg 
ctatcccagt 
gaatgtatag 
actatgtaga 
attggatgac 
aagcattagg 
gacctagtca 
taatgatgca 
gcaaggaagg 
gtggaaagga 
aaatttggcc 
cagccccacc 
gcttcaggtt 
taacttccct 



aaaaattagg 
aagcagggag 
taaacaaata 
attatacaac 
caaggaagcc 
acagcaggca 
tcaagggcaa 
aatagaggag 
agccacccca 
gcaaatgtta 
ccatgcgggg 
aactactagt 
gggagacatc 
ccctatcagc 
ccggttcttt 
agaaaccttg 
aataggggct 
caaagcaaga 
gagaagcaat 
gcatatagcc 
aggacaccaa 
ttcccacaag 
agcagagagt 
cgaggagaca 
caaatcactc 

atgggtgcga 
ttaaggccag 
ctggaaagat 
atgaaacagc 
acagtagcaa 
ttagacaaga 
aaagcggctg 
atggtacatc 
aaggctttta 
caagatttaa 
aaagatacca 
cctatggcac 
acccttcggg 
tataaaagat 
attttggaca 
aaagccttaa 
ctggtccaaa 
acattggaag 
gtgttagctg 
tttaaaagct 
agaaattgca 
atgaaagatt 
gggaggccag 
ttcaggaaca 
acccccactc 
tttggcagcg 



gagcgtcaat 
ggggaaagaa 
ttgcacttaa 
tacacccagc 
ctctctattg 
tagaggaaga 
acgaaaaagt 
agaacctatc 
gcccagaggt 
gcaccatgtt 
tcaatgaaga 
caggccaatt 
aacaaatagc 
ggataattct 
taagacaagg 
gagctgaaca 
atgcgaaccc 
aaatgatgac 
aggcaatgag 
caaaaagaat 
gggcccctag 
gtactgagag 
ggaatttcct 
gaccagagcc 
cgaagcagga 
acccctcgtc 



attaagaggg 
acattatatg 
ccctggcctt 
tcttcagaca 
tgtacatgaa 
acaaaacaaa 
cagtcaaaat 
acctagaacc 
aatacccatg 
aaatacggtg 
ggctgcagaa 
gagagaacca 
atggatgaca 
ggggttaaat 
gccaaaggaa 
agctacacaa 
agattgtaag 
agcatgtcag 
ccaagcaaac 
tgttaaatgt 
gaaaaagggc 
gcaggcaaat 
tcagaacaga 
aacggctcca 
gccgaaagac 
acaataa 



acgaaattag 

ttaaaacacc 
ttagaaacat 
ggaacagagg 
aacataaagg 
attaaaagtc 
tatcctatag 
ttgaatgcat 
tttacagcat 

gggggacatc 

tgggatagat 
aggggaagtg 
agtaatccac 
aaaatagtga 
ccctttagag 
gatgtaaaaa 
accattttaa 
ggagtggggg 
aatacaaaca 
tccaactgtg 
tgttggaaat 
tttttaggga 
ccagagccaa 
ccagcagaga 
agggatccct 



atgcatggga 

tagtatgggc 
cagaaggctg 
aacttaaatc 
tacgagacac 
agcaaaaaac 
tgcagaatct 
gggtaaaagt 
tatcagaagg 
aagcagccat 
tacacccagt 
acatagcagg 
ctatcccagt 
gaatgtatag 
actatgtaga 
attggatgac 
aagcattagg 
gacctagtca 
taatgatgca 
gcaaggaagg 
gtggaaagga 
aaatttggcc 
cagccccacc 
gcttcaggtt 
taacttccct 



aaaaattagg 

aagcagggag 
taaacaaata 
attatacaac 
caaggaagcc 
acagcaggca 
tcaagggcaa 
aatagaggag 
agccacccca 
gcaaatgtta 
ccatgcgggg 
aactactagt 
gggagacatc 
ccctgtcagc 
ccggttcttt 
agaaaccttg 
aataggggct 
caaagcaaga 
gagaagcaat 
gcatatagcc 
aggacaccaa 
ttcccacaag 
agcagagagt 
cgaggagaca 
caaatcactc 

atgggtgcga 
ctaaggccag 
ctggaaagat 
ataaaacagc 
acagtagcaa 
ttagacaaga 
gctgacgaag 
caccaggcca 
tttagcccag 
ttaaacacca 
accatcaatg 
gcaccaggcc 
caggaacaaa 
agatggataa 
gacataaaac 
ttaagagctg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
tctaaaagaa 
agggccccta 
tgtactgaaa 
gggaatttcc 
gaggaaacaa 
aaatcactct 



gagcgtcaat 
ggggaaggaa 
tcgcacttaa 
tacacccagc 
ctctctattg 
tagaggaaga 
gagtcagtca 
tatcacctag 
aagtaatacc 
tgttaaatac 
aggaggctgc 
aaatgaggga 
tagcatggat 
ttctggggtt 
aagggccaaa 
aacaagctac 
acccagattg 
tgacagcatg 
tgagccaagc 
ttgttaaatg 
gaaaaaaggg 
ggcaggctaa 
tccagagcag 
cccccgctcc 
ttggcagcga 



attaagaggg 
acactatatg 
ccctggcctt 
tcttaagaca 
tgtacatgaa 
acaaaacaaa 
aaattatccc 
aactttgaat 
catgtttaca 
agtaggggga 
agaatgggat 
acctagagga 
gacaggtaac 
aaataaaata 
ggaacccttt 
acaagatgta 
taagaccatc 
tcagggagtg 
aaacagtaac 
tttcaactgt 
ctgttggaaa 
ttttttaggg 
gccagagcca 
gaaacaggag 
cccctcgtca 



gaaaaattag 
ctaaaacatc 
ttagagacat 
ggaacagagg 
aacatagagg 
agtcagcaaa 
atagtgcaga 
gcatgggtga 
gcattatcag 
catcaagcag 
agattacatc 
agtgacatag 
ccacctgtcc 
gtaagaatgt 
agagactatg 
aaaaattgga 
ttaaaggcat 
ggaggacctg 
ataatgatgc 
ggcaaggaag 
tgtggacaag 
aaaatttggc 
acagccccac 
tcgaaggaca 
caataa 



ataaatggga 
tagtatgggc 
cacaaggctg 
aacttaggtc 
tacgagacac 
aaacacagca 
atctccaagg 
aagtaataga 
aaggagccac 
ccatgcagat 
cagtccatgc 
caggaactac 
cagtgggaga 
atagccctgt 
tagatcggtt 
tgacagacac 
tgggaccagc 
gccacaaagc 
agagaagcaa 
ggcacatagc 
aaggacacca 
cttcccacaa 
cagcagagag 
gggaaccctt 



gaaaattagg 
aagcagagag 
taaacaaata 
attatacaac 
caaggaggcc 
ggcaaaagcg 
gcaaatggta 
ggagaaggct 
cccacaagat 
gttaaaagat 
agggcctgct 
tagtaccctt 
catctataaa 
cagcattttg 
ctttaaagtt 
cttgttgatc 
ggcttcatta 
aagagtgttg 
ttttaaagga 
cagaaattgc 
aatgaaagac 
ggggaggcca 
cttcaggttc 
aatttccctc 

atgggtgcga 
ttaaggccag 
ttggaaaaat 
atgaaccagc 
acagtagcaa 
ttagataaga 
gctgacgaaa 
catcaagcca 
tttagcccag 
ttaaacacca 
accatcaatg 
gcaccaggcc 
caggaacaaa 
agatggataa 
gacataagac 
ttaagagctg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
ggtcctagaa 
tgcagggccc 
gactgtactg 
ccagggaatt 
ttcgaggaga 
ctcaaatcac 



gagcgtcaat 
ggggaaagaa 
ttgcacttaa 
tacaaccagc 
ctctctattg 
tagaggaaga 
aggtcagtca 
tatcacctag 
aggtaatacc 
tgttaaatac 
aggaggctgc 
aaatgagaga 
tagcatggat 
ttctggggtt 
aaggaccaaa 
aacaagctac 
acccagattg 
tgacagcatg 
tgagccaagc 
aaattattag 
ctaggaaaaa 
aaaggcaggc 
tccttcagaa 
caacccccac 
tctttggcag 



attaaaaggc 
acattatatg 
ccctggcctt 
tcttcagaca 
tgtacataaa 
acaaaacaaa 
aaattatcct 
aaccttgaat 
catgtttaca 
ggtgggggga 
agaatgggat 
accaagggga 
tacagctaac 
aaataaaata 
ggaacccttt 
acaagatgta 
taagaccatt 
tcagggagtg 
aaacaatgca 
atgtttcaac 
aggctgttgg 
taatttttta 
cagaccagag 
tccgaggcag 
cgacccctcg 



gaaaaattag 
ttaaaacaca 
ttagaaacag 
ggaacagagg 
aagatagatg 
agtcagcaaa 
atagtacaaa 
gcatgggtaa 
gcattatcag 
catcaagcag 
agattacatc 
agtgacatag 
ccacctattc 
gtgagaatgt 
agagactatg 
aaaaattgga 
ttaagagcat 
ggaggaccta 
gtcataatga 
tgtggtaagg 
aaatgtggaa 
gggaaaattt 
ccaacagccc 
gagtcgaaag 
tcacaataa 



atagatggga 
tagtatgggc 
cagaaggctg 
aacttaaatc 
tacgagacac 
aaacacagca 
atctccaagg 
aagtaataga 
aaggagccac 
ccatgcaaat 
cagtacatgc 
caggaactac 
cagtaggaga 
atagccctgt 
tagatcggtt 
tgacagacac 
taggaccagg 
gccacaaagc 
tgcagaaaag 
aagggcacat 
aggagggaca 
ggccttccca 
caccagcaga 
acagggaacc 



aagaattagg 
aagcagggag 
taatcaaata 
attattcaac 
caaggaagcc 
ggcaaaagcg 
gcaaatggta 
ggagaaggcc 
cccacaagat 
gttaaaagat 
ggggcctgtt 
tagtaccctt 
aatctataaa 
cagcattttg 
ctttaaaact 
cttgttggtc 
ggctacatta 
aagagttttg 
caattttaaa 
agccagaaac 
ccaaatgaaa 
caaggggagg 
gagcttcaag 
cttaacttcc 

atgggtgcga gagcgtcaat attaagaggc 
ttaaggccag ggggaaagaa acactatatg 
ctggaaagat ttgcacttaa ccctggcctt 
ataagacagc tacaaccagc tcttcagaca 
acagtagcaa ctctctattg tgtacatgca 
ttagacagga tagaggaaga acagaaaaaa 
gctgacggga agatcagtca aaattatcct 
caccaggcca tatcacctag aactttgaat 
tttagcccag aagtaatacc catgtttaca 
ttaaacacca tgctaaatac agtgggggga 
accatcaatg aggaggctgc agaatgggac 
gcaccaggcc aaatgagaga accaagggga 
caggaacaaa tagcatggat gacaagtaac 
agatggataa ttctgggcct aaataaaata 
gacataaaac aaggaccaaa ggaacccttt 
ttaagagccg aacaagctac acaagatgta 
caaaatgcga acccagattg taagatcatt 
gaagaaatga tgacagcatg tcagggagtg 
gctgaggcaa tgagccaagc aaacagtaca 
ggccctaaaa gaaacattaa atgttttaac 
tacagggccc ctaggaaaaa aggttgttgg 
gactgtacag agagacaggc taatttttta 
ccagggaact tccttcagaa cagaacagag 
ttcgaggaga caaaccctgc tccgaagcag 
ctcaaatcac tctttggcag cgacccctcg 



ggaaaattag atacatggga aaaaattagg 
ctaaaacatc tagtatgggc aagcagggag 
ttagagacat cagaaggctg taaacaaata 
ggaacagagg aacttaaatc attatataac 
aagatagagg tacgagacac caaggaagcc 
tgtcagcaaa aaacacagca ggcaaaagag 
atagtgcaga atcttcaagg gcaaatggta 
gcatgggtaa aagtaataga ggagaaggct 
gcattatcag aaggagccac cccacaagat 
catcaagcag ccatgcaaat gttaaaagat 
agaatacatc cagtacatgc agggcctatt 
agtgacatag caggaactac tagtaccctt 
ccacctgttc cagtgggaga aatctataaa 
gtaagaatgt atagccctgt cagcattttg 
agagattatg tagatcggtt ctttaaaact 
aaaaattgga tgacagacac cttgttggtc 
ttaagaggat taggaccagg ggctacatta 
ggaggacctg gccacaaagc aagagtgttg 
aatataatga tgcagagagg caattttaaa 
tgtggcaagg aagggcacct agccagaaat 
aaatgtggaa aagaaggaca ccaaatgaaa 
gggaaaattt ggccttccca caagggaagg 
ccaacagccc caccagcaga gagcttcagg 
gagccgaaag acagggaacc cttaacttcc 
tcacaataa 

atgggtgcga 
ttaaggccag 
ctggaaagat 
ataagacaac 
acagtagcaa 
ttagataaga 
gctgacggga 
caccaggcca 
tttagcccag 
ttaaacacca 
accatcaatg 
gcaccaggcc 
caggaacaaa 
agatggataa 
gacataaaac 
ttaagagccg 
caaaatgcga 
gaagaaatga 
gctgaggcaa 
ggccctaaaa 
tgcagggccc 
gactgtacag 
ccagggaact 
ttcgaggaga 
ctcaaatcac 



gagcgtcaat 
ggggaaagaa 
ttgcacttaa 
tacaaccagc 
ctctctattg 
tagaggaaga 
agatcagtca 
tatcacctag 
aagtaatacc 
tgctaaatac 
aggaggctgc 
aaatgagaga 
tagcatggat 
ttctgggcct 
aaggaccaaa 
aacaagctac 
acccagattg 
tgacagcatg 
tgagccaagc 
gaaacattaa 
ctaggaaaaa 
agagacaggc 
tccttcagaa 
caaaccctgc 
tctttggcag 



attaggaggc 
acactatatg 
ccctggcctt 
tcttcagaca 
tgtacatgca 
acagaaaaaa 
aaattatcct 
aactttgaat 
catgtttaca 
agtgggggga 
agaatgggac 
accaagggga 
gacaagtaac 
aaataaaata 
ggaacccttt 
acaagatgta 
taagatcatt 
tcagggagtg 
aaacagtaca 
atgttttaac 
gggttgttgg 
taatttttta 
ccgaacagag 
tccgaagcag 
cgacccctcg 



ggaaaattag 
ctaaaacatc 
ttagagacat 
ggaacagagg 
aagatagagg 
tgtcagcaaa 
atagtgcaga 
gcatgggtaa 
gcattatcag 
catcaagcag 
agaatacatc 
agtgacatag 
ccacctgttc 
gtaagaatgt 
agagattatg 
aaaaattgga 
ttaagaggat 
ggaggacctg 
aatataatga 
tgtggcaagg 
aaatgtggaa 
gggaaaattt 
ccaacagccc 
gagccgaaag 
tcacaataa 



atacatggga 
tagtatgggc 
cagaaggctg 
aacttaaatc 
tacgagacac 
aaacacagca 
atcttcaagg 
aagtaataga 
aaggagccac 
ccatgcaaat 
cagtacatgc 
caggaactac 
cagtgggaga 
atagccctgt 
tagaccggtt 
tgacagacac 
taggaccagg 
gccacaaagc 
tgcagagagg 
aagggcacct 
aagaaggaca 
ggccttccca 
caccagcaga 
acagggaacc 



aaaaattagg 
aagcagggag 
taaacaaata 
attatacaac 
caaggaagcc 
ggcaaaagag 
gcaaatggta 
ggagaaggct 
cccacaagat 
gttaaaagat 
agggcctatt 
tagtaccctt 
aatctataaa 
cagcattttg 
ctttaaaact 
cttgttggtc 
ggctacatta 
aagagtgttg 
caattttaaa 
agccagaaat 
ccaaatgaaa 
caagggaaga 
gagcttcagg 
cttaacttcc 



WO 03/004620 

PCT/US02/21420 

128/158 



wo 03/004620 



A (continued)

[Amino acid sequence alignment, positions 120 to 192. Aligned isolates: TV007-6, TV007-2, TV019-82, TV019-85, TV008-17, TV008-1, TV014-25, TV014-31, TV004-45, TV001-2, TV018-7, TV018-8, TV002-84, TV009-3, TV013-2, TV013-3, TV003-X2, TV003-B, TV005-81, TV012-4, TV006-9, TV010-25, 92BR025, 301904-Ind, 301905-Ind, 301999-Ind, 96BW16-D14, 96BW04-09, 96BW12-10, C2220-Eth, HXB2; bottom row: Consensus. Annotated features: critical cysteine; phosphorylation sites. Sequence length: 192 residues for each isolate except two, which have 191.]




Figure 101 



[Amino acid sequence alignment. Aligned isolates: TV007-6, TV007-2, TV019-82, TV019-65, TV008-17, TV008-1, TV014-25, TV014-31, TV004-45, TV001-2, TV018-7, TV018-8, TV002-84, TV013-2, TV013-3, TV003-12, TV003-B, TV005-81, TV006-9, TV012-4, TV010-25, 92BR025, 301904-Ind, 301905-Ind, 301999-Ind, 96BW16-D14, 96BW04-09, 96BW12-10, C2220-Eth, HXB2; bottom row: Consensus. Annotated features: α-helix transmembrane domain; phosphorylation sites; signature motif; C-terminal domain, interaction with CD4.]



FIGURE 102 (SEQ ID NO:181)
(Sheet 1 of 2)

3'half#8_2_TV1_C.ZA 

GTCGACTGTAGTCCAGGAATATGGCAATTAGATTGTACACATTTAGAAGGAAAAATCATCCT 

GGTAGCAGTCCATGTAGCTAGTGGCTACATAGAGGCAGAGGTTATCCCAGCAGAAACAGG 

ACAAGAAACAGCATATTTTATATTAAAATTAGCAGGAAGATGGCCAGTCAAGGTAATACATA 

CAGACAATGGCAGTAATTTTACCAGTGCTGCAGTTAAGGCAGCCTGTTGGTGGGCAGGTAT 

CCAAGAGGAATTTGGAATTCCCTACAATCCCCAAAGTCAGGGAGTGGTAGAATCCATGAAT 

AAAGAATTAAAGAAAATAATAGGACAAGTAAGAGATCAAGCTGAGCACCTTAGGACAGCAG

TACAAATGGCAGTATTCATTCACAATTTTAAAAGAAAAGGGGGGATTGGGGGGTACAGTGC 

AGGGGAAAGAATAATAGACATAATAGCAACAGACATACAAACTAAAGAATTACAAAAACAAA

TTATAAAAATTCAAAATTTTCGGGTTTATTACAGAGACAGCAGAGACCCTATTTGGAAAGGA

CCAGCGAAACTACTCTGGAAAGGTGAAGGGGGAGTAGTAATAGAAGATAAAGGTGACATAA 

AGGTAGTACCAAGGAGGAAAGGAAAAATCATTAGAGATTATGGAAAACAGATGGCAGGTGG 

TGATTGTGTGGCAGGTGGACAGGATGAAGATTAGAGCATGGAATAGTTTAGTAAAGCACCA 

TATGTATATATGAAGGAGAGCTAGTGGATGGTCOTACAAACATCATTTTGAAAGCAGACATC 

GAAAAGTAAGTTCAGAAGTACATATCCCATTAGGGGATGCTAGATTAGTAATAAAAACATAT 

TGGGGTTTGCAGACAGGAGAAAGAGATTGGCATTTGGGTCATGGAGTCTCCATAGAATGG 

AGACTGAGAGAATATAGCACACAAGTAGAGCCTGGGCTGGCAGACCAGCTAATTCATATGC

AmTTTTGATTGTTTTACAGAATCTGCCATAAGACAAGCAATATTAGGACACATAGTTATCC 

CTAGGTGTGACTATCAAGCAGGACATAAGAAGGTAGGATCTCTACAATACTTGGCACTGAC 

AGCATTGATAAAACCAAAAAGGAGAAAGCCACCTCTGCCTAGTGTTAGGAAATTAGTAGAG 

GATAGATGGAACGACCCCCAGAAGACCAGGGGCCGCAGAGGGAACCATACAATGAATGG 

ACACTAGAGATTCTAGAAGAACTCAAGCAGGAAGCTGTCAGACACTTTCCTAGACCATGGC 

TCCATAACTTATGAAACCTATGGGGATACTTGGACGGGAGTTGAAGCTATAATAAGAGTAC 

TGCAACAACTACTGTTCATTCATTTCAGAATTGGATGCCAACATAGCAGAATAGGCATnTG 

CAACAGAGAAGAGCAAGAAATGGAGCCAGTAGATCCTAAACTAGAGCCCTGGAACCATCC 

AGGAAGCCAAGCTAAAACTGCTTGTAATAATTGCTTTTGCAAACACTGTAGCTATCATTGTC 

TAGTTTGCTTTCAGACAAAAGGCTTAGGCATTTCCTATGGCAGGAAGAAGCGGAGACAGCG 

ACGAAGCGCTCGTCCAAGTGGTGAAGATCATCAAAATCCTCTATGAAAGCAGTAAGTACTC 

ATAGTAGATGTAATGGTAAGTTTAAGTTTAGATAAAGGAATAGATTATAGATTAGGAGTAGG 

AGCATTAATAGTAGCACTAATGATAGCAATAATAGTGTGGACCATAGTATATATAGAATATAA 

GGAAATTGGTAAGACAAAAGAAAATAGACTGGTTAATTAAAAGAATTAGGGAAAGAGCAGA 

AGACAGTGGCAATGAGAGTGATGGGGACAGAGAAGAATTGTGAACAATGGTGGATATGGG 

GCATGTTAGGCTTGTGGATGCTAATGATTTGTAACACGGAGGACTTGTGGGTGACAGTGTA 

GTATGGGGTACGTGTGTGGAGAGACGCAAAAACTACTCTATTCTGTGCATGAGATGCTAAA 

GCATATGAGAGAGAAGTGCATAATGTCTGGGCTACACATGCCTGTGTACCCACAGACCCCA 

ACCCACAAGAAATAGTTTTGGGAAATGTAACAGAAAATTTTAATATGTGGAAAAATGACATG 

GCAGATCAGATGCATGAGGATGTAATCAGTTTATGGGATCAAAGCGTAAAGCCATGTGTAA 

AGTTGACCGGACTCTGTGTCACTTTAAACTGTACAGATAGAAATGTTACAGGTAATAGAACT 

GTTAGAGGTAATAGTAGGAATAATAGAAATGGTACAGGTATTTATAAGATTGAAGAAATGAA 

AAATTGGTGTTTGAATGGAACCACAGAATTAAGAGATAAGAAACATAAAGAGTATGGACTGT

TTTATAGACTTGATATAGTACCAGTTAATGAGAATAGTGAGAAGTTTAGATATAGATTAATAA 

ATTGCAATACGTCAAGGATAACAGAAGCCTGTCCAAAGGTGTGTTTTGACGGGATTGGTATA 

GATTAGTGTGGTCGAGGTGGTTATGCGATTGTAAAGTGTAATAATAAGAGATTCAATGGGAC 

AGGAGGATGTTATAATGTCAGGAGAGTACAATGTACACATGGAATTAAGCGAGTGGTATGA 

ACTCAATTACTGTTAAATGGTAGTCTAGCAGAAGAAGGGATAATAATTAGATCTGAAAATTT 

GACAGAGAATACCAAAAGAATAATAGTACAGCTTAATGAATGTGTAGAGATTAATTGTACAA 

GACCGAAGAATAATAGAAGAAAAAGTGTAAGGATAGGACCAGGACAAGCATTGTATGCAAC 
AAATGATGTAATAGGAAACATAAGACAAGCACATTGTAACATTAGTACAGATAGATGGAACA

AAACTTTACAACAGGTAATGAAAAAATTAGGAGAGCATTTCCCTAATAAAACAATACAATTTA 

AACCACATGCAGGAGGGGATCTAGAAATTACAATGCATAGCTTTAATTGTAGAGGAGAATT 

TTTCTATTGTAATACATGAAAGCTGTTTAATAGCACATAGCACTCTAATAATGGTACATACAA 

ATACAATGGTAATTCAAGCTCACCCATCACACTCCAATGTAAAATAAAACAAATTGTACGCA 

TGTGGCAAGGGGTAGGACAAGCAACGTATGCCCCTCCCATTGCAGGAAACATAACATGTA

GATCAAACATCACAGGAATACTATTGACAGGTGATGGAGGATTTAACACCACAAACAACAC 

AGAGACATTCAGACCTGGAGGAGGAGATATGAGGGATAACTGGAGAAGTGAATTATATAAA 

TATAAAGTAGTAGAAATTAAGCCATTGGGAATAGCACGCACTAAGGCAAAAAGAAGAGTGG

TGCAGAGAGAAAAAAGAGCAGTGGGAATAGGAGCTGTGTTCCTTGGGTTCTTGGGAGCAG 

CAGGAAGCACTATGGGCGCAGCGTCAATAACGCTGACGGTACAGGCCAGACAACTGTTGT 

CTGGTATAGTGCAACAGCAAAGCAATTTGCTGAAGGCTATAGAGGCGCAACAGCATATGTT 

GCAACTCACAGTCTGGGGCATTAAGCAGCTCCAGGCGAGAGTCCTGGCTATAGAAAGATA

CGTAAAGGATCAACAGCTCCTAGGGATTTGGGGGTGCTCTGGAAGACTCATGTGCACCACT

GCTGTGCCTTGGAACTCCAGTTGGAGTAATAAATCTGAAAAAGATATTTGGGATAACATGA 

CTTGGATGCAGTGGGATAGAGAAATTAGTAATTACACAGGCTTAATATACAATTTGCTTGAA 

GACTCGCAAAACCAGCAGGAAAAGAATGAAAAAGATTTATTAGAATTGGACAAGTGGAACA 

ATCTGTGGAATTGGTTTGACATATCAAACTGGCCGTGGTATATAAAAATATTCATAATGATA 

GTAGGAGGCTTGATAGGTTTAAGAATAATTTTTGCTGTGCTTTCTATAGTGAATAGAGTTAG

GCAGGGATACTCACCTTTGTCATTTCAGACCCTTACCCCAAGCCCGAGGGGACTCGACAG 

GCTCGGAGGAATCGAAGAAGAAGGTGGAGAGCAAGACAGAGACAGATCCATACGATTGGT 

GAGCGGATTCTTGTCGCTTGGCTGGGACGATCTGCGGAACCTGTGCCTCTTCAGCTACCA 

CCGCTTGAGAGACTTCATATTAATTGCAGTGAGGGCAGTGGAACTTCTGGGACACAGCAGT 

CTCAGGGGACTACAGAGGGGGTGGGAAATCCTTAAGTATCTGGGAAGTCTTGTGCAATATT 

GGGGTCTAGAGCTAAAAAAGAGTGCTATTAGTCTGCTTGATACCATAGCAATAACAGTAGC 

TGAAGGAACAGATAGGATTATAGAATTAGTACAAAGAATTTGTAGAGCTATCCTCAACATAC 

CTAGAAGAATAAGACAGGGCTTTGAAGCAGCTTTGCTATAAAATGGGGGGCAAGTGGTCAA 

AATGCAGCGGATGGCCTGCAGTAAGAGAAAGAATGAGACGAGCTGAGCCAGGAGCAGAG 

GGAGTAGGACCAGGGTCTCAAGACTTAGATAGACATGGGGCACTTACAAGCAGCAACACA 

CCTGCCAATAATGATGCTTGTGCCTGGCTGCAAGCACAGGAGGAGGACGGAGATGTAGGC 

TTTCGAGTCAGACGTCAGGTACCTTTAAGACCAATGACTTATAAGAGCGCATTCGATCTCAG 

CTTCTTTTTAAAAGAAAAGGGGGGACTGGATGGGTTAGTTTACTCTAAGAAAAGGGAAGAA 

ATCCTTGATTTGTGGGTCTATAACAGACAAGGCTTGTTCCGTGATTGGCAAAACTAGAGAGC 

GGGGCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGGTAGTGCCAGTTGA 

CCCAGGGGAGGTGGAAGAGGCCAAGGGAGGAGAAGACAAGTGTTTGCTAGACGGTATGA 

GCCAACATGGAGCAGAGGATGAAGATAGAGAAGTATTAAAGTGGAAGTTTGACAGTGTCCT 

AGCAGGGAGAGACATGGCGCGCGAGCTAGATCCGGAGTATTACAAAGACTGCTGACACAG 

AAGGGAGTTTGGGCCTGGGACTTTCCAGTGGGGCGTTCCGGGAGGTGTGGTCTGGGCGG 

GACTTGGGAGTGGTGAACCCTCAGATGCTGCATATAAGCAGCTGCTTTTCGCTTGTACTGG 

GTCTGTCTCGGTAGACGAGATCTGAGCCTGGGAGCTCTCTGGCTATCTAGGGAACCCAGT 

GCTTAAGGCTCAATAAAGCTTGCCTTGAGTGCTTTAAGTAGTGTGTGCCCGTCTGTTGTGT

GAGTGTGGTAACTAGAGATCCCTCAGACCCTTTGTGGTAGTGTGGAAAATCTCTAGCAGCG
GCCGC 
FIGURE 103 (SEQ ID NO:182)
(Sheet 1 of 5) 

Full#2_l/4_TV12_C_ZA 

TGGAAGGGTTAATTTACTCTAATAAAAGGCAAGAGATCCTTGATTTGTGG 

GTTTATAACACACAAGGOTCITCCCTGATTGGCAAAACTACACACCGGG 

GCCAGGGGTCAGATATCCACTGACCTTTGGATGGTGCTACAAGCTAGAGC

CAGTCGATCCAAAGGAAGTAGAAGAGGCCAATGAAGGAGAAAACAACTG 

TTTACTACACCCTATGAGCCAGCATGGGATGGAGGATGAAGACAGAGAAG 

TATTAAGATGGAAGTTTGACAGTATGCTAGCACGCAGACACATGGCCCGC

GAGCTACATCCGGAGTATTACAAGGACTGCTGACACAGAAGGGACTTTCC 

GCTGGGACTTTCCACTGGGGCGTTCCAGGAGGTGTGGTCTGGGCGGGACT 

GGGGAGTGGTCAGCCCTGAGATGCTGCATATAAGCAGCTGCTTTTCGCCT 

GTACTGGGTCTCTCTAGGTAGACCAGATCTGAGCCCGGGAGCTCTCTGGCT 

ATCTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCCTT 

GAGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCA 

GACCACTTGTGGTGTGTGGAAAATCTCTAGCAGTGGCGCCTGAACAGGGA 

CTTGAAAGCGAAAGTAAGACCAGAGGAGATCTCTCGACGCAGGACTCGG 

CTTGCTGAAGTGCACTCGGCAAGAGGCGAGAGAGGCGGCTGGTGAGTAC 

GCCAAATTTTATTTGACTAGCGGAGGCTAGAAGGAGAGAGATGGGTGCGA 

GAGCGTCAGTATTGAAAGGGAAAAAATTAGATACATGGGAAAGAATTAG 

GTTAAGGCCAGGGGGAAAGAAACACTATATGCTAAAACACCTAGTATGG 

GCAAGCAGGGAGCTGGAAAGATTTGCACTTAACCCTGGCCmTAGAAAC 

AGCAGAAGGCTGTAAACAAATAATGCAACAGCTACAATCAGCTCTTCAGA

CAGGAACAGAGGAACTTAGATCATTATATAACACAGTAGCAACTCTCTAT 

TGTGTACATAAAGAGATAGATGTACGAGACACCAAGGAAGCCTTAGACA 

AGATAGAGGAAGAACAAAATAAGAGTCAGCAAAAAACACAGCAAGCAG 

AAGCGGCTGACAAAGGAAAGGTCAGTCAAAATTATCCAATAGTGCAGAA 

TCTCCAAGGGCAAATGGTACACCAGGCCATATCACCGAGAACTTTAAATG 

CATGGGTAAAAGTAATAGAAGAGAAGGCTTTCAGCCCAGAGGTAATACCC 

ATGTTTACAGCATTATCAGAAGGAGCTACCCCACAAGATTTAAACACCAT 

GTTAAATACAGTGGGGGGACACCAAGCAGCCATGCAAATGTTAAAAGAT 

ACCATCAATGAGGAGGCTGCAGAATGGGATAGGTTACATCCAGTGCATGC 

AGGGCCTATTGCACCAGGCCAAATGAGAGAACCAAGGGGAAGTGACATA 

GCAGGAACTACTAGTACCCTTCAAGAACAAATAGCATGGATGACAAGTAA 

CCCACCTATTCCGGTGGGAGACATCTATAAAAGATGGATAATTCTGGGGT 

TAAATAAAATAGTAAGAATGTATAGCCCTGTCAGCATTTTGGACATAAAA 

CAAGGGCCAAAAGAACCCTTTAGAGACTATGTAGACCGATTCTTTAAAAC 

TTTAAGGGCTGAACAATCTTCACAAGAGGTAAAAAATTGGATGACAGACA 

CCTTGTTGGTCCAAAATGCAAACCCAGATTGTAAGACCATTTTAAGAGCA

TTAGGACCAGGGGCTACATTAGAGGAAATGATGACAGCATGTCAGGGAGT 

AGGAGGACCTGGCCACAAAGCAAGAGTTTTGGCTGAGGCAATGAGCCAA 

GCAAATACAAACATAATGATGCAGAAAAGCAATTTTAAAGGCCCTAAAA

GAACTGTTAAATGTTTCAATTGTGGCAAGGAAGGGCATATAGCCAGAAAT 

TGCAGGGCCCCTAGGAAAAAGGGCTGTTGGAAATGTGGAAAGGAAGGAC 

ACCAAATGAAAGACTGTACTGAAAGGCAGGCTAATTTTTTAGGGAAAATT

TGGCCTTCCTACAAGGGGAGGCCGGGGAATTTCCTTCAGAGCAGACCAGA

ACCATCAGCCCCACCAGCAGAGAGCTTCAGGTTCGAGGAGCAGGAGCCG 

AAAGACAAGGAACCACCCTTAACTTCCCTCAAATCACTCTTTGGCAGCGA 

CCCCTTGTCTCAATAAAAGTAGAGGGCCAGATAAAGGAGGCTCTCTTAGA 

TACAGGAGCAGATGATACAGTATTAGAAGAAATAAATTTGCCAGGAAAAT 
FIGURE 103 (SEQ ID NO:182)
(Sheet 2 of 5) 

GGAAACCAAAAATGATAGGAGGAATTGGAGGTTTTATCAAAGTAAGACA

GTATGAGCAAATACTTATAGAAATTTGTGGAAAAAAGGCTATAGGAACAG 

TATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAATATGTTGACT 

CAGCTTGGATGCACACTAAATTTTCCAATTAGTCCCATTGAAACTGTACCA

GTAAAATTAAAGCCAGGAATGGATGGCCCAAGAGTTAAACAATGGCCATT 

GACAGAAGAAAAAATAAAAGCATTAACAGCAATTTGTGAAGAAATGGAG 

AAGGAAGGAAAAATTACAAAAATTGGGCCTGAAAATCCATATAACACTCC 

AGTATTTGCCATAAAAAAGAAGGACAGTACTAAGTGGAGAAAATTAGTA 

GATTTCAGGGAACTCAATAAAAGAACTCAAGACTTTTGGGAAGTTCAATT 

AGGAATACCACACCCAGCAGGGTTAAAAAAGAAAAAATCAGTGACAGTG

CTGGATGTGGGGGATGCATATTTTTCAGTTCCTTTAGATGAAAGCTTCAGG 

AAATATACTGCATTCACCATACCTAGTATAAACAATGAAGCACCAGGGAT 

TAGATATCAATATAATGTGCTTCCACAGGGGTGGAAAGGATCACCAGCAA 

TATTCCAGTGTAGCATGACAAAAATCTTAGAGCCTTATAGGAAACAAAAT 

CCAAACATAGTTATCTATCAATATATGGATGATTTGTATGTAGGATCTGAC 

TTAGAAATAGGGCAACATAGAGCAAAAATAGAGGAGTTAAGAGAACATT 

TATTGAGGTGGGGACTTACCACACCAGACAAGAAACATCAGAAAGAACC 

CCCATTTCTCTGGATGGGGTATGAACTACATCCTGACAAATGGACAGTAC 

AGCCTATACTGCTGCCAGAAAAGGATAGCTGGACTGTCAATGATATACAG 

AAGTTAGTGGGAAAGTTAAACTGGGCCAGTCAGATTTACCCAGGGATTAA 

AGTAAAGTACTTGTGCAAACTCCTTAGGGGAGCCAAAGCACTAACAGACA 

TAGTACCACTGACTGAAGAAGCTGAATTAGAATTGGCAGAGAACAGGGA 

AATTCTAAAAGAACCAGTACATGGAGTATATTATGACCCCTCAAAAGACT 

TAATAGCTGAAATACAGAAACAGGGGCATGACCAATGGACATACCAAATT 

TACCAAGAACCATTCAAAAATCTGAAAACAGGGAAGTATGCAAAAATGA 

GGACTGCCCACACTAATGATGTAAAACAGTTAACAGAAGCAGTGCAAAA 

AATAGCTCTAGAAAGCATAGTAATATGGGGAAAGACTCCTAAATTCAGAC 

TACCCATCCAAAAAGAAACATGGGAGACATGGTGGACAGACTATTGGCA 

AGCCACCTGGATCCCTGAATGGGAGTTTGTTAATACCCCTCCCCTAGTAAA 

ATTATGGTACCAACTGGAAAAAGAACCCATAGCAGGGGTAGAGACTTTCT 

ATGTAGATGGAGCAGCTAACAGGGAAACTAAAATAGGAAAAGCAGGGTA 

TGTTACTGACAAAGGAAGACAGAAAATTGTTACTCTAAATGAAACAACAA 

ATCAGAAGGCTGAGTTACAAGCAATTCAGCTAGCTTTGCAGGATTCAGGA 

TCAGAAGCAAACATAGTAACAGACTCACAGTATGCATTAGGAATTATTCA 

AGCACAACCAGATAAGAGTGAATCAGAGTTAGTTAACCAGATAATAGAA 

CAGTTAATAAACAAGGAGAGAATCTACCTGTCATGGGTACCAGCACATAA 

AGGAATTGGAGGAAATGAACAAGTAGACAAATTAGTAAGTAGTGGAATC 

AGGAAAGTGCTGTTTCTAGATGGGATAGATAAGGCTCAAGAAGAGCATGA

AAAATATCACAGCAATTGGAGAGCAATGGCTAGTGAGTTTAATCTGCCAC 

CCATAGTAGCAAAAGAAATAGTAGCCAGCTGTGATAAATGTCAGCTAAAA 

GGGGAAGCCATACATGGACAAGTCGACTGTAGTCCAGGAATATGGCAATT 

AGATTGTACACATTTAGAAGGAAAAATCATCCTGGTAGCAGTCCATGTAG 

CCAGTGGCTACATAGAAGCAGAGGTTATCCCAGCAGAAACAGGACAAGA 

AACAGCATATTATATACTAAAATTAGCAGGAAGATGGCCAGTTAAAATAA 

TACATACAGATAATGGCAGTAATTTCACCAGTGCTGCAGTTAAAGCAGCC 

TGTTGGTGGGCAGGAATCCAACAGGAATTTGGAATTCCCTACAATCCCCA 

AAGTCAGGGAGTAGTAGAATCCATGAATAAAGAATTAAAGAAAATCATA 

GGGCAGGTAAGAGATCAAGCTGAGCACCTCAAGACAGCAGTACAAATGG 
CAGTATTCATTCACAATTTTAAAAGAAAAGGGGGGATTGGGGGGTACAGT

GCAGGGGAAAGGATAATAGACATAATAGCAACAGACATACAAACTAGAG 

AATTACAAAAACAAATTATAAAAATTCAAAATTTTCGGGTTTATTACAGG 

GACAGCAGAGACCCTATTTGGAAAGGACCAGCCAAACTACTCTGGAAAG 

GTGAAGGGGCAGTAGTAATACAAGATAATAGTGACATAAAGGTAGTACC 

AAGGAGGAAAGTAAAAATCATTAAGGACTATGGAAAACAGATGGCAGGT 

GCTGATTGTGTGGCAGGTAGACAGGATGAAGATTAGAACATGGAATAGTT 

TGGTAAAGCATCACATATATATTTCAAGGAGAGCTAATGGATGGTTTTAC

AGACATCATTATGAAAGCAGACACCCAAAAATAAGTTCAGAAGTACACAT 

CCCATTAGGGGATGCTAGATTAGTAATAAAAACATATTGGGGTTTGCATA 

CAGGAGAAAGAGATTGGCATTTGGGTCATGGAGTCTCCATAGAATGGAAA 

TTGAGAAAATATAGCACACAAGTAGACCCTGGCCTGGCAGACCAGCTAAT

TCATGTGCATTATTTTGATTGTTTTGCAGACTCTGCCATAAGACAAGCCAT 

ATTAGGACACATAGTTATTCCTAGGTGTGACTATCAAGCAGGACATAATA 

AGGTAGGATCTCTACAATACTTGGCACTGACAGCATTGATAAAACCAAAA 

AAGAGAAAGCCACCTTTGCATAGTGTTAGGAAATTAGTAGAGGATAGATG 

GAACAAGCCCCAGAAGACCAGGGACCGCAGAGGGAACCATACAATGAAT 

GGACACTAGAGCTTTTAGAGGAACTCAAACAGGAAGCTGTCAGACACTTT 

CCTAGACCATGGCTCCATAGCTTAGGGCAACATATCTATAACACCTATGG 

GGATACTTGGACAGGAGTAGAAGCTATAATAAGAATTCTGCAACAACTAC 

TGTTTATTCATTTCAGAATTGGGTGCCAGCATAGCAGAATAGGCATTATGC

GACAGAGAAGAGCAAGAAATGGAACCAGTAGATCCTAAACTTGAGCCCT 

GGAAACATCCAGGAAGTCAGCCTAAAACTCCTTGTAATAATTGCTATTGC 

AAAAAATGTAGCTATCATTGTCTAGTTTGCTTTCAGAAAAAAGGCTTAGG

CATTTCATATGGCAGGAAGAAGCGGAGACAACGACGAAGCACTCCTCCAA 

GCAGTGAGGATCATCAAAATCTTATATCAAAGCAGTAAGTACTAAATGGT 

AGATGTAATGTTAAGTTTTCTAGAAAAAGTAGATTATGAAATAGGAGTAG 

CAGCATTTATAATAGCACTAATCATAGCAATAGTTGTGTGGATCATAGTAT 

ATATAGAATATAGGAAATTGTTAAGACAAAAAAGAATAGACTGGTTAATT 

GAAAGAATTAGAGAAAGGGCAGAAGACAGTGGCAATGAGAGTGATGGGG 

AGCAGGAGGAATTATCAACAATGGTGGATATGGGGAATCTTAGGCTTTTG 

GATGCTAATGGTTGGTAATGTAATGGGGAACTTGTGGGTCACAGTCTATT 

ATGGGGTACCTGTGTGGAAAGACGCAAAAGCTACTCTATTTTGTGCATCT 

GATGCTAAAGCATATGAGAAAGAAGTGCATAATGTCTGGGCTACACATGC 

CTGTGTACCCACAGACCCCGACCCACAAGAAATAGTTTTGGAGAATGTAA

CAGAAAATTTTAACATGTGGAAAAATAACATGGTGGACCAGATGCATGAG 

GATATAATCAGCTTATGGGATCAAAGCCTAAAGCCATGTGTAAAGTTGAC 

CCCACTCTGTGTCACTTTAAACTGTAGCAATAATGTTAAAAATGCTACCAA 

CAGTATGAAGGAAATGAAAAATTGCACTTTCAATATAACCACAGAACTAA 

GAGATAAGAGAAAGCAAGAATATGCACTTTTTTATAAACTTGATATAGTA 

CCACTTGAGGAGAATTCCAGTAAGTATAGATTAATAAATTGTAATACCTC

AGCCATAACCCAAGCCTGTCCAAAGGTCTCTTTTGACCCAATTCCTATACA 

TTATTGTGCTCCAGCTGGTTATGCGATTCTAAAGTGTAATAATAAGACATT 

CAATGGAACAGGACCATGCAATAATGTCAGCACGGTACAATGTACACATG 

GAATTAAGCCAGTAGTATCAACTCAACTACTGTTAAATGGTAGTCTAGCA 

GAAGAAGAAATAGTAATTAGATCTGAAAATATGACAAACAATGCCAAAA 

TAATAATAGTACATCTTAATGAATCTGTAGAAATTACGTGTACAAGGCCC 

AACAATAATACAAGAAAAAGTATGAGGATAGGACCAGGACAAACATTCT 
ATGCAACAGGAGACATAATAGGAGATATAAGACAAGCACACTGTAACAT

TAGTGAAAAGCAATGGGATCAGACTTTATACAGGGTAAGTGAAAAATTAA 

AAGAACACTTCCCTAATAAAACAATAAAGTTTAACTCATCCTCAGGAGGG 

GACTTAGAAATTACAACACATAGCTTTAATTGTGGAGGAGAGTTTTTCTAT 

TGCAATACATCTGTACTGTTTAATGGCACATACAGTAATGGCACAAACAG 

TACAAATACAACAGTCATCACACTCCCATGCAGAATAAAACAAATTATAA 

ACATGTGGCAGGGGGTAGGACGAGCAATGTATGCCCCTCCCATTGCAGGA

AACATAACATGTAGATCAAACATCACAGGACTAATATTGACACGTGATGG 

AGGGCAGGGAGAGAATGACACAAATGAGATATTTAGACCTGCAGGAGGA 

GATATGAGGGACAATTGGAGAAGTGAATTATACAAATATAAAGTGGTAG 

AAATTCAGCCATTAGGAGTAGCACCCACTAAGGCAAAAAGGAGAGTGGT 

GGAGAGAGAAAAAAGAGCAGCTTTGGGAGCTGTGTTCCTTGGGTTCTTGG 

GAGCAGCAGGAAGCACTATGGGCGCGGCATCAATAATGCTGACGGTACA 

GGCCAGACAACTGTTGTCTGGTATAGTGCAACAGCAAAGCAATTTGCTGA 

GAGCTGTAGAGGCGCAACAGCATATGTTGCAACTCACGGTCTGGGGCATT 

AAGTAGCTCCAGACAAGAGTCCTGGCTATAGAAAGATACCTAAAGGATCA 

ACAGCTCCTAGGGATTTGGGGCTGCTCTGGAAAACTCATCTGCACCACTG 

CCGTGCCTTGGAACAATAGTTGGAGTAATAAATCTCAAGATTATATTTGG 

GGAAACATGACCTGGATGCAATGGGATAAAGAAATTAGCAATTACACAG 

AAACAATATACAGGTTGCTTGGGGACGCGCAAAACCAGCAGGAGAAAAA 

TGAAAAGGAGTTACTAGAATTGGACAGGTGGGGAAATCTGTGGAACTGGT 

TTGACATAACAAAATGGCTGTGGTATATAAAAATATTCATAATGGTAATA 

GGAGGCTTGATAGGTTTAAGAATAATTTTTGCTGTGCTTTCTATAGTAAAT

AGAGTTAGGCAGGGATACTCACCTTTGTCATTTCAGACCCTTGCCCAAAAC 

CCGAGGGGACCCGACAGGCTCGGAAGAACCGAAGAAGAAGGTGGAGAGC 

AAGACAGAGACAGATCCATAAGATTAGTGAGCGGATTCTTAGCACTTGCC 

TGGGAGGACCTGAGGAACCTGTGCAnTTCCTCTACCACCGATTGAGAGA 

CTTCATATTGGTGACAGCGAGAGCAGTGGAACTTCTGGGACGCAGCAGTC 

TCAGGGGACTCCAGAGGGGGTGGGAAATCCTTAAGTACCTGGGAAGTCTT 

GTGCAGTATTGGGGTCTAGAGCTAAAAAAGAGTGCTGTTAGTCTGCTTGA 

TAGCGTAGCAATAGCAGTAGCTGAGGGAACAGATAGAATTATAGAATTCT 

TACAAGGAACTGGTAGAGCTATCTACAACATACCTAGAAGAATAAGACAG 

GGCTTTGAAGCAGCTTTGCAGTAAAATGGGAAATAAGTGGTCAAAAAGCT 

GGCCTGCTGTAAGAGAAAGAATATGGAAAACTAGGCCAGCAGCAGCAGA 

AGCAGCTAGGCCAGCAGCAGCAGAAGGAGTAGGAGCAGCGTCTCAAGAC 

TTGGATAAACGTGGGGCGCTTACAATCAACAACACAGCCAACAATAATCC 

TGATTGTGCCTGGCTGGAAGCGCAAGAGGATGAGGAAGTAGGCTTTCCAG 

TCAGACCTCAGGTACCTTTAAGACCAATGACATATAAGGCAGCATTTGAT 

CTCAGCTTCTTTTTAAAAGAAAAGGGGGGACTGGAAGGGTTAATTTACTC

CAGGAAAAGGCAAGAGATCCTTGATTTATGGGTCTATCACACACAAGGCT

ACTTCCCTGATTGGCAAAACTACACACCGGGACCAGGGGTCAGATATCCA 

CTGACCTTTGGATGGTGCTTCAAGCTAGTGCCAGTTGACCCAAGGGAAGT 

AGAAGAGGCCAACGGAGGAGAAGACAACTGTTTGCTACACCCTATGAGC 

CAGTATGGAATGGATGATGAACACAAAGAAGTGCTACAGTGGAAGTTTGA 

CAGCAGCCTAGCACGCAGACACCTGGCCCGCGAGCTACATCCGGATTATT 

ACAAAGACTGCTGACACAGAAGGGACTTTCCGCCTGGGACTTTCCACTGG

GGCGTTCCAGGGGGAGTGGTCTGGGCGGGACTGGGAGTGGCCAGCCCTCA 

GATGCTGCATATAAGCAGCTGCTTTTCGCCTGTACTGGGTCTCTCTAGGTA 
GACCAGATCTGAGCCTGGGAGCTCTCTGTCTATCTGGGGAACCCACTGCTT
AAGCCTCAATAAAGCTTGCCTTGAGTGCTCTAAGTAGTGTGTGCCCATCTG 
TTGTGTGACTCTGGTAACTCTGGTAACTAGAGATCCCTCAGACCCTTTGTG 
GTAGTGTGGAAAATCTCTAGCA 



FIGURE 104 (SEQ ID NO:183)

gp140.modTV1.mut1.dV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc 
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg 
1441 gtgcagcgcg agaagagcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc 
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg 
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg 
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac 
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag 
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa 
FIGURE 105 (SEQ ID NO: 184)

gp140mod.TV1.mut2.dV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgccgcgtg 
1441 gtgcagagcg agaagagcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc 
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg 
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc 
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac 
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa



FIGURE 106 (SEQ ID NO: 185)

gp140mod.TV1.mut3.dV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc 
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac 
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gcgcagcgtg 
1441 gtgcagagcg agaagagcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc 
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg 
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag 
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa 
FIGURE 107 (SEQ ID NO: 186)

gp140mod.TV1.mut4.dV2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 aactgcagct tcaacgccgg cgccggccgc ctgatcaact gcaacaccag caccatcacc 
541 caggcctgcc ccaaggtgag cttcgacccc atccccatcc actactgcgc ccccgccggc 
601 tacgccatcc tgaagtgcaa caacaagacc ttcaacggca ccggcccctg ctacaacgtg 
661 agcaccgtgc agtgcaccca cggcatcaag cccgtggtga gcacccagct gctgctgaac 
721 ggcagcctgg ccgaggaggg catcatcatc cgcagcgaga acctgaccga gaacaccaag 
781 accatcatcg tgcacctgaa cgagagcgtg gagatcaact gcacccgccc caacaacaac
841 acccgcaaga gcgtgcgcat cggccccggc caggccttct acgccaccaa cgacgtgatc 
901 ggcaacatcc gccaggccca ctgcaacatc agcaccgacc gctggaacaa gaccctgcag 
961 caggtgatga agaagctggg cgagcacttc cccaacaaga ccatccagtt caagccccac 
1021 gccggcggcg acctggagat caccatgcac agcttcaact gccgcggcga gttcttctac 
1081 tgcaacacca gcaacctgtt caacagcacc taccacagca acaacggcac ctacaagtac 
1141 aacggcaaca gcagcagccc catcaccctg cagtgcaaga tcaagcagat cgtgcgcatg 
1201 tggcagggcg tgggccaggc cacctacgcc ccccccatcg ccggcaacat cacctgccgc 
1261 agcaacatca ccggcatcct gctgacccgc gacggcggct tcaacaccac caacaacacc 
1321 gagaccttcc gccccggcgg cggcgacatg cgcgacaact ggcgcagcga gctgtacaag 
1381 tacaaggtgg tggagatcaa gcccctgggc atcgccccca ccaaggccaa gagcagcgtg 
1441 gtgcagagcg agaagagcgc cgtgggcatc ggcgccgtgt tcctgggctt cctgggcgcc 
1501 gccggcagca ccatgggcgc cgccagcatc accctgaccg tgcaggcccg ccagctgctg 
1561 agcggcatcg tgcagcagca gagcaacctg ctgaaggcca tcgaggccca gcagcacatg 
1621 ctgcagctga ccgtgtgggg catcaagcag ctgcaggccc gcgtgctggc catcgagcgc 
1681 tacctgaagg accagcagct gctgggcatc tggggctgca gcggccgcct gatctgcacc
1741 accgccgtgc cctggaacag cagctggagc aacaagagcg agaaggacat ctgggacaac 
1801 atgacctgga tgcagtggga ccgcgagatc agcaactaca ccggcctgat ctacaacctg 
1861 ctggaggaca gccagaacca gcaggagaag aacgagaagg acctgctgga gctggacaag 
1921 tggaacaacc tgtggaactg gttcgacatc agcaactggc cctggtacat ctaa 



FIGURE 108 (SEQ ID NO: 187)

gp140.modTV1.GM161

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 cagtgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg
541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg acaacttcac ctaccgcctg 
601 atcaactgca acaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc
661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 
721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc
841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 
901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 
961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 
1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 
1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc 
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc 
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc 
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcta a
FIGURE 109 (SEQ ID NO: 188)

gp140mod.TV1.GM161-195-204

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 cagtgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg 
541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg accagttcac ctaccgcctg 
601 atcaactgcc agaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 
661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 
721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc
841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 
901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 
961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc
1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc
1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag 
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc 
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac 
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc 
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc 
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc 
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcta a
FIGURE 110 (SEQ ID NO: 189)
gp140mod.TV1.GM161-204

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc gacaccaacg tgaccggcaa ccgcaccgtg 
421 accggcaaca gcaccaacaa caccaacggc accggcatct acaacatcga ggagatgaag 
481 cagtgcagct tcaacgccac caccgagctg cgcgacaaga agcacaagga gtacgccctg 
541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg acaacttcac ctaccgcctg 
601 atcaactgcc agaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 
661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc
721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc 
841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 
901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 
961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 
1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 
1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac 
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag 
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc 
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac 
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc 
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc 
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcta a
FIGURE 111 (SEQ ID NO: 190)
gp140mod.TV1.GM-V1V2

1 atgcgcgtga tgggcaccca gaagaactgc cagcagtggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatctg caacaccgag gacctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgtggc gcgacgccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 accgaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gagatcgtgc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggccgac 
301 cagatgcacg aggacgtgat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gcagtgcacc gacacccagg tgaccggcca gcgcaccgtg 
421 accggccaga gcacccagaa cacccagggc accggcatct acaacatcga ggagatgaag
481 cagtgcagct tccaggccac caccgagctg cgcgacaaga agcacaagga gtacgccctg
541 ttctaccgcc tggacatcgt gcccctgaac gagaacagcg accagttcac ctaccgcctg 
601 atcaactgcc agaccagcac catcacccag gcctgcccca aggtgagctt cgaccccatc 
661 cccatccact actgcgcccc cgccggctac gccatcctga agtgcaacaa caagaccttc 
721 aacggcaccg gcccctgcta caacgtgagc accgtgcagt gcacccacgg catcaagccc 
781 gtggtgagca cccagctgct gctgaacggc agcctggccg aggagggcat catcatccgc 
841 agcgagaacc tgaccgagaa caccaagacc atcatcgtgc acctgaacga gagcgtggag 
901 atcaactgca cccgccccaa caacaacacc cgcaagagcg tgcgcatcgg ccccggccag 
961 gccttctacg ccaccaacga cgtgatcggc aacatccgcc aggcccactg caacatcagc 
1021 accgaccgct ggaacaagac cctgcagcag gtgatgaaga agctgggcga gcacttcccc 
1081 aacaagacca tccagttcaa gccccacgcc ggcggcgacc tggagatcac catgcacagc 
1141 ttcaactgcc gcggcgagtt cttctactgc aacaccagca acctgttcaa cagcacctac 
1201 cacagcaaca acggcaccta caagtacaac ggcaacagca gcagccccat caccctgcag 
1261 tgcaagatca agcagatcgt gcgcatgtgg cagggcgtgg gccaggccac ctacgccccc 
1321 cccatcgccg gcaacatcac ctgccgcagc aacatcaccg gcatcctgct gacccgcgac 
1381 ggcggcttca acaccaccaa caacaccgag accttccgcc ccggcggcgg cgacatgcgc 
1441 gacaactggc gcagcgagct gtacaagtac aaggtggtgg agatcaagcc cctgggcatc 
1501 gcccccacca aggccaagcg ccgcgtggtg cagcgcgaga agcgcgccgt gggcatcggc 
1561 gccgtgttcc tgggcttcct gggcgccgcc ggcagcacca tgggcgccgc cagcatcacc 
1621 ctgaccgtgc aggcccgcca gctgctgagc ggcatcgtgc agcagcagag caacctgctg 
1681 aaggccatcg aggcccagca gcacatgctg cagctgaccg tgtggggcat caagcagctg 
1741 caggcccgcg tgctggccat cgagcgctac ctgaaggacc agcagctgct gggcatctgg 
1801 ggctgcagcg gccgcctgat ctgcaccacc gccgtgccct ggaacagcag ctggagcaac 
1861 aagagcgaga aggacatctg ggacaacatg acctggatgc agtgggaccg cgagatcagc
1921 aactacaccg gcctgatcta caacctgctg gaggacagcc agaaccagca ggagaagaac 
1981 gagaaggacc tgctggagct ggacaagtgg aacaacctgt ggaactggtt cgacatcagc 
2041 aactggccct ggtacatcta a
FIGURE 112 (SEQ ID NO: 191)

gp140modC8.2mut7.delV2.Kozmod.Ta

1 gccaccatgc gcgtgatggg cacccagaag aactgccagc agtggtggat ctggggcatc 
61 ctgggcttct ggatgctgat gatctgcaac accgaggacc tgtgggtgac cgtgtactac 
121 ggcgtgcccg tgtggcgcga cgccaagacc accctgttct gcgccagcga cgccaaggcc 
181 tacgagaccg aggtgcacaa cgtgtgggcc acccacgcct gcgtgcccac cgaccccaac 
241 ccccaggaga tcgtgctggg caacgtgacc gagaacttca acatgtggaa gaacgacatg 
301 gccgaccaga tgcacgagga cgtgatcagc ctgtgggacc agagcctgaa gccctgcgtg 
361 aagctgaccc ccctgtgcgt gaccctgaac tgcaccgaca ccaacgtgac cggcaaccgc 
421 accgtgaccg gcaacagcac caacaacacc aacggcaccg gcatctacaa catcgaggag 
481 atgaagaact gcagcttcaa cgccggcgcc ggccgcctga tcaactgcaa caccagcacc 
541 atcacccagg cctgccccaa ggtgagcttc gaccccatcc ccatccacta ctgcgccccc 
601 gccggctacg ccatcctgaa gtgcaacaac aagaccttca acggcaccgg cccctgctac 
661 aacgtgagca ccgtgcagtg cacccacggc atcaagcccg tggtgagcac ccagctgctg 
721 ctgaacggca gcctggccga ggagggcatc atcatccgca gcgagaacct gaccgagaac 
781 accaagacca tcatcgtgca cctgaacgag agcgtggaga tcaactgcac ccgccccaac 
841 aacaacaccc gcaagagcgt gcgcatcggc cccggccagg ccttctacgc caccaacgac 
901 gtgatcggca acatccgcca ggcccactgc aacatcagca ccgaccgctg gaacaagacc
961 ctgcagcagg tgatgaagaa gctgggcgag cacttcccca acaagaccat ccagttcaag 
1021 ccccacgccg gcggcgacct ggagatcacc atgcacagct tcaactgccg cggcgagttc 
1081 ttctactgca acaccagcaa cctgttcaac agcacctacc acagcaacaa cggcacctac
1141 aagtacaacg gcaacagcag cagccccatc accctgcagt gcaagatcaa gcagatcgtg
1201 cgcatgtggc agggcgtggg ccaggccacc tacgcccccc ccatcgccgg caacatcacc 
1261 tgccgcagca acatcaccgg catcctgctg acccgcgacg gcggcttcaa caccaccaac 
1321 aacaccgaga ccttccgccc cggcggcggc gacatgcgcg acaactggcg cagcgagctg 
1381 tacaagtaca aggtggtgga gatcaagccc ctgggcatcg cccccaccaa ggccatcagc 
1441 agcgtggtgc agagcgagaa gagcgccgtg ggcatcggcg ccgtgttcct gggcttcctg 
1501 ggcgccgccg gcagcaccat gggcgccgcc agcatcaccc tgaccgtgca ggcccgccag 
1561 ctgctgagcg gcatcgtgca gcagcagagc aacctgctga aggccatcga ggcccagcag 
1621 cacatgctgc agctgaccgt gtggggcatc aagcagctgc aggcccgcgt gctggccatc 
1681 gagcgctacc tgaaggacca gcagctgctg ggcatctggg gctgcagcgg ccgcctgatc
1741 tgcaccaccg ccgtgccctg gaacagcagc tggagcaaca agagcgagaa ggacatctgg 
1801 gacaacatga cctggatgca gtgggaccgc gagatcagca actacaccgg cctgatctac 
1861 aacctgctgg aggacagcca gaaccagcag gagaagaacg agaaggacct gctggagctg 
1921 gacaagtgga acaacctgtg gaactggttc gacatcagca actggccctg gtacatctaa 
1981 a
FIGURE 115 (SEQ ID NO:203)

Nef-myrD124LLAA

ATGGCCGGCAAGTGGAGCAAGAGCAGCATCGTGGGCTGGCCCGCCGTGCGCGAGCG 

CATCCGCCGCACCGAGCCCGCCGCCGAGGGCGTGGGCGCCGCCAGCCAGGACCTGG

ACAAGCACGGCGCCCTGACCAGCAGCAACACCGCCGCCAACAACGCCGACTGCGCC 

TGGCTGGAGGCCCAGGAGGAGGAGGAGGAGGTGGGCTTCCCCGTGCGCCCCCAGGT 

GCCCCTGCGCCCCATGACCTACAAGGCCGCCTTCGACCTGAGCTTCTTCCTGAAGGA

GAAGGGCGGCCTGGAGGGCCTGATCTACAGCAAGAAGCGCCAGGAGATCCTGGACC 

TGTGGGTGTACCACACCCAGGGCTTCTTCCCCGGCTGGCAGAACTACACCCCCGGCC 

CCGGCGTGCGCTACCCCCTGACCTTCGGCTGGTGCTTCAAGCTGGTGCCCGTGGACC

CCCGCGAGGTGGAGGAGGCCAACAAGGGCGAGAACAACTGCgcGgcGCACCCCATGA 

GCCAGCACGGCATGGAGGACGAGGACCGCGAGGTGCTGAAGTGGAAGTTCGACAG 

CAGCCTGGCCCGCCGCCACATGGCCCGCGAGCTGCACCCCGAGTACTACAAGGACT 

GCGCCTAA
FIGURE 116 (SEQ ID NO:204)

Nef-myrD124LLAA 

MaGKWSKSSIVGWPAVRERIRRTEPAAEGVGAASQDLDKHGALTSSNTAANNADCA 
WLEAQEEEEEVGFPVRPQVPLRPMTYKAAFDLSFFLKEKGGLEGLIYSKKRQEILDL 
WVYHTQGFFPgWQNYTPGPGVRYPLTFGWCFKLVPVDPREVEEANKGENNCaaHPM
SQHGMEDEDREVLKWKFDSSLARRHMARELHPEYYKDCA
FIGURE 117 (SEQ ID NO:205)

gp160mod.TV2

1 atgcgcgccc gcggcatcct gaagaactac cgccactggt ggatctgggg catcctgggc 
61 ttctggatgc tgatgatgtg caacgtgaag ggcctgtggg tgaccgtgta ctacggcgtg 
121 cccgtgggcc gcgaggccaa gaccaccctg ttctgcgcca gcgacgccaa ggcctacgag 
181 aaggaggtgc acaacgtgtg ggccacccac gcctgcgtgc ccaccgaccc caacccccag 
241 gaggtgatcc tgggcaacgt gaccgagaac ttcaacatgt ggaagaacga catggtggac 
301 cagatgcagg aggacatcat cagcctgtgg gaccagagcc tgaagccctg cgtgaagctg 
361 acccccctgt gcgtgaccct gaactgcacc aacgccaccg tgaactacaa caacaccagc 
421 aaggacatga agaactgcag cttctacgtg accaccgagc tgcgcgacaa gaagaagaag 
481 gagaacgccc tgttctaccg cctggacatc gtgcccctga acaaccgcaa gaacggcaac
541 atcaacaact accgcctgat caactgcaac accagcgcca tcacccaggc ctgccccaag 
601 gtgagcttcg accccatccc catccactac tgcgcccccg ccggctacgc ccccctgaag 
661 tgcaacaaca agaagttcaa cggcatcggc ccctgcgaca acgtgagcac cgtgcagtgc 
721 acccacggca tcaagcccgt ggtgagcacc cagctgctgc tgaacggcag cctggccgag 
781 gaggagatca tcatccgcag cgagaacctg accaacaacg tgaagaccat catcgtgcac 
841 ctgaacgaga gcatcgagat caagtgcacc cgccccggca acaacacccg caagagcgtg 
901 cgcatcggcc ccggccaggc cttctacgcc accggcgaca tcatcggcga catccgccag 
961 gcccactgca acatcagcaa gaacgagtgg aacaccaccc tgcagcgcgt gagccagaag 
1021 ctgcaggagc tgttccccaa cagcaccggc atcaagttcg ccccccacag cggcggcgac 
1081 ctggagatca ccacccacag cttcaactgc ggcggcgagt tcttctactg caacaccacc
1141 gacctgttca acagcaccta cagcaacggc acctgcacca acggcacctg catgagcaac
1201 aacaccgagc gcatcaccct gcagtgccgc atcaagcaga tcatcaacat gtggcaggag 
1261 gtgggccgcg ccatgtacgc cccccccatc gccggcaaca tcacctgccg cagcaacatc 
1321 accggcctgc tgctgacccg cgacggcggc gacaacaaca ccgagaccga gaccttccgc 
1381 cccggcggcg gcgacatgcg cgacaactgg cgcagcgagc tgtacaagta caaggtggtg 
1441 gagatcaagc ccctgggcgt ggcccccacc gccgccaagc gccgcgtggt ggagcgcgag 
1501 aagcgcgccg tgggcatcgg cgccgtgttc ctgggcttcc tgggcgccgc cggcagcacc 
1561 atgggcgccg ccagcatcac cctgaccgtg caggcccgcc agctgctgag cggcatcgtg 
1621 cagcagcaga gcaacctgct gcgcgccatc gaggcccagc agcacatgct gcagctgacc 
1681 gtgtggggca tcaagcagct gcaggcccgc gtgctggcca tcgagcgcta cctgcaggac 
1741 cagcagctgc tgggcctgtg gggctgcagc ggcaagctga tctgcaccac caacgtgctg 
1801 tggaacagca gctggagcaa caagacccag agcgacatct gggacaacat gacctggatg 
1861 cagtgggacc gcgagatcag caactacacc aacaccatct accgcctgct ggaggacagc
1921 cagagccagc aggagcgcaa cgagaaggac ctgctggccc tggaccgctg gaacaacctg
1981 tggaactggt tcagcatcac caactggctg tggtacatca agatcttcat catgatcgtg 
2041 ggcggcctga tcggcctgcg catcatcttc gccgtgctga gcctggtgaa ccgcgtgcgc 
2101 cagggctaca gccccctgag cctgcagacc ctgatcccca acccccgcgg ccccgaccgc 
2161 ctgggcggca tcgaggagga gggcggcgag caggacagca gccgcagcat ccgcctggtg 
2221 agcggcttcc tgaccctggc ctgggacgac ctgcgcagcc tgtgcctgtt ctgctaccac 
2281 cgcctgcgcg acttcatcct gatcgtggtg cgcgccgtgg agctgctggg ccacagcagc 
2341 ctgcgcggcc tgcagcgcgg ctggggcacc ctgaagtacc tgggcagcct ggtgcagtac 
2401 tggggcctgg agctgaagaa gagcgccatc aacctgctgg acaccatcgc catcgccgtg 
2461 gccgagggca ccgaccgcat cctggagttc atccagaacc tgtgccgcgg catccgcaac 
2521 gtgccccgcc gcatccgcca gggcttcgag gccgccctgc agtaa
Figure 119
Figure 120
Figure 121
Figure 122
Figure 123
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION IS LACKING

This application contains the following inventions or groups of inventions, which are not so linked as to form a single general 
inventive concept under PCT Rule 13.1. In order for all inventions to be searched, the appropriate additional search fees must be 
paid. 

Group I, claim(s) 1, 2, 16-61, drawn to an expression cassette comprising a polynucleotide including HIV Gag polypeptide comprises
a sequence having at least 90% sequence identity to a sequence set forth in SEQ ID NO: 9, a recombinant expression system, cell or 
gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a subject using the vector and 
method of producing Gag polypeptide. 

Group II, claim(s) 3, 4, 15-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Env polypeptide
comprises a sequence having at least 90% sequence identity to a sequence set forth in SEQ ID NO: 21, a recombinant expression
system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a subject using
the vector. 

Group III, claim(s) 5, 6, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Int polypeptide
comprises a sequence having at least 95% sequence identity to a sequence set forth in SEQ ID NO: 39, a recombinant expression
system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a subject using
the vector. 

Group IV, claim(s) 7, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Nef polypeptide
comprises a sequence having at least 90% sequence identity to a sequence set forth in SEQ ID NO: 41, a recombinant expression
system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a subject using
the vector. 

Group V, claim(s) 8, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV p15RNaseH
polypeptide comprises a sequence having at least 95% sequence identity to a sequence set forth in SEQ ID NO: 43, a recombinant
expression system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a
subject using the vector. 

Group VI, claim(s) 9, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Pol polypeptide, a
recombinant expression system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune
response in a subject using the vector. 

Group VII, claim(s) 10, 14, 15-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Tat
polypeptide comprises a sequence having at least 90% sequence identity to a sequence set forth in SEQ ID NO: 46, a recombinant
expression system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a
subject using the vector.

Group VIII, claim(s) 11, 12, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Prot
polypeptide comprises a sequence having at least 95% sequence identity to a sequence set forth in SEQ ID NO: 49, a recombinant 
expression system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a 
subject using the vector. 

Group IX, claim(s) 13, 16-32, 34-61, drawn to an expression cassette comprising a polynucleotide including HIV Rev polypeptide
comprises a sequence having at least 90% sequence identity to a sequence set forth in SEQ ID NO: 52, a recombinant expression
system, cell or gene delivery vector comprising the expression cassette; methods of stimulating an immune response in a subject using
the vector.
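
Several of the groups above turn on a percent sequence identity threshold (at least 90% or 95% relative to a reference SEQ ID NO). Purely as an illustration, the following Python sketch computes percent identity between two sequences that are assumed to be already aligned and of equal length. The function names `percent_identity` and `meets_threshold`, the gap convention, and the toy inputs are assumptions made for this example; they are not taken from the application, which may define identity through a particular alignment algorithm and parameters.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumptions (not from the application): inputs are already aligned,
    '-' marks a gap, comparison is case-insensitive, and columns that are
    gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.lower(), aligned_b.lower()):
        if x == "-" and y == "-":
            continue  # skip columns that are gaps in both sequences
        compared += 1
        if x == y:
            matches += 1  # x cannot be '-' here, so this is a true match
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the percent identity is at least `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy 10-mers differing at one position: 9 of 10 columns match.
    print(percent_identity("atgcgcgtga", "atgcgcgtgc"))
    print(meets_threshold("atgcgcgtga", "atgcgcgtgc", threshold=95.0))
```

A gap in one sequence counts as a mismatch in this sketch; other conventions (for example, excluding gapped columns from the denominator) give different percentages, so the convention would need to be fixed before comparing against a claimed threshold.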

This application contains claims directed to more than one species of the generic invention. These species are deemed to lack unity of
invention because they are not so linked as to form a single general inventive concept under PCT Rule 13.1.
In order for more than one species to be examined, the appropriate additional examination fees must be paid. The species are as
follows:

a) nucleic acid encoding a HIV Gag polypeptide set forth in SEQ ID NO: 10-20;
b) nucleic acid encoding a HIV Env polypeptide set forth in SEQ ID NO: 22-38 and 183-191;
c) nucleic acid encoding a HIV Int polypeptide having at least 98% sequence identity to SEQ ID NO: 40;
d) nucleic acid encoding a HIV Nef polypeptide set forth in SEQ ID NO: 203;
e) nucleic acid encoding a HIV Pol polypeptide set forth in SEQ ID NO: 44 and 45;
f) nucleic acid encoding a HIV Tat polypeptide set forth in SEQ ID NO: 47, 48, and 53-60;
g) nucleic acid encoding a HIV Prot polypeptide set forth in SEQ ID NO: 50.

Applicant is reminded that up to ten (10) nucleotide sequences will be searched in a PCT
application. See Examination of Patent Applications Containing Nucleotide Sequences, 1192
O.G. 68 (November 19, 1996).

The claims are deemed to correspond to the species listed above in the following manner: 

Claims 1, 2, and 16-61 correspond to the species of a).
Claims 3, 4, 15-32, 34-61 correspond to the species of b).
Claims 6 and 16-32, 34-61 correspond to the species of c).
Claims 7 and 16-32, 34-61 correspond to the species of d).
Claims 9 and 16-32, 34-61 correspond to the species of e).
Claims 10, 14, 16-32, 34-61 correspond to the species of f).
Claims 11, 12, 16-32, 34-61 correspond to the species of g).

The following claim(s) are generic: claims 1-7, 9-12, and 14-61.

The inventions listed as Groups I-IX do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT
Rule 13.2, they lack the same or corresponding special technical features for the following reasons:

The special technical feature of Group I is considered to be an expression cassette comprising a polynucleotide sequence encoding a
polypeptide including Gag polypeptide. 

The special technical feature of Group II is considered to be an expression cassette comprising a polynucleotide sequence encoding a 
polypeptide including Env polypeptide. 

The special technical feature of Group III is considered to be an expression cassette comprising a polynucleotide sequence encoding a 
polypeptide including Int polypeptide.

The special technical feature of Group IV is considered to be an expression cassette comprising a polynucleotide sequence encoding a
polypeptide including Nef polypeptide. 

The special technical feature of Group V is considered to be an expression cassette comprising a polynucleotide sequence encoding a 
polypeptide including p15RNaseH polypeptide.

The special technical feature of Group VI is considered to be an expression cassette comprising a polynucleotide sequence encoding a
polypeptide including Pol polypeptide.

The special technical feature of Group VII is considered to be an expression cassette comprising a polynucleotide sequence encoding a
polypeptide including Tat polypeptide. 

The special technical feature of Group VIII is considered to be an expression cassette comprising a polynucleotide sequence encoding
a polypeptide including Prot polypeptide. 

The special technical feature of Group IX is considered to be an expression cassette comprising a polynucleotide sequence encoding a
polypeptide including Rev polypeptide. 

Accordingly, Groups I-IX are not so linked by the same or a corresponding technical feature as to form a single general inventive
concept. 

The species listed above do not relate to a single general inventive concept under PCT Rule 13.1 because, under PCT Rule
13.2, they lack the same or corresponding special technical features for the following reasons: the species of (a) SEQ ID NOs: 10-20
are different structurally and/or functionally, (b) SEQ ID NOs: 22-38 and 183-191 are different structurally and/or functionally, (c)
SEQ ID NO: 40 is different structurally and/or functionally, (d) SEQ ID NO: 203 is different structurally and/or functionally, (e) SEQ ID
NOs: 44 and 45 are different structurally and/or functionally, (f) SEQ ID NOs: 47, 48, 53-60 are different structurally and/or
functionally, (g) SEQ ID NO: 50 is different structurally and/or functionally and the PCT Rules for Lack of Unity do not apply.